Passing and returning numpy arrays to C++ methods via Cython

Michael Schubert · Jul 25, 2013

There are lots of questions about using numpy in cython on this site, a particularly useful one being "Simple wrapping of C code with cython".

However, the cython/numpy interface API seems to have changed a bit since then, in particular regarding how to ensure that memory-contiguous arrays are passed.

What is the best way to write a wrapper function in cython that:

  • takes a numpy array that is likely but not necessarily contiguous
  • calls a C++ class method with the signature double* data_in, double* data_out
  • returns a numpy array holding the data that the method wrote to the output double*?

My try is below:

cimport numpy as np
import numpy as np # as suggested by jorgeca

cdef extern from "myclass.h":
    cdef cppclass MyClass:
        MyClass() except +
        void run(double* X, int N, int D, double* Y)

def run(np.ndarray[np.double_t, ndim=2] X):
    cdef int N, D
    N = X.shape[0]
    D = X.shape[1]

    cdef np.ndarray[np.double_t, ndim=1, mode="c"] X_c
    X_c = np.ascontiguousarray(X, dtype=np.double).ravel()  # flatten: X_c is declared ndim=1

    cdef np.ndarray[np.double_t, ndim=1, mode="c"] Y_c
    Y_c = np.ascontiguousarray(np.zeros((N*D,)), dtype=np.double)

    cdef MyClass myclass
    myclass = MyClass()
    myclass.run(<double*> X_c.data, N, D, <double*> Y_c.data)

    return Y_c.reshape(N, D)

This code compiles but is not necessarily optimal. Do you have any suggestions on improving the snippet above?

An earlier version also threw a NameError ("name 'np' is not defined", on the line X_c = ...) when called at runtime. The exact testing code and error message were the following:

import numpy as np
import mywrapper
mywrapper.run(np.array([[1,2],[3,4]], dtype=np.double))

# NameError: name 'np' is not defined [at mywrapper.pyx: X_c = ...]
# fixed!

Answer

Robert T. McGibbon · Aug 11, 2013

You've basically got it right. First, optimization hopefully shouldn't be a big deal: ideally, most of the time is spent inside your C++ kernel, not in the cython wrapper code.

There are a few stylistic changes you can make that will simplify your code. (1) Reshaping between 1D and 2D arrays is not necessary. When you know the memory layout of your data (C order vs. Fortran order, striding, etc.), you can treat the array as just a chunk of memory that you're going to index yourself in C++, so numpy's ndim doesn't matter on the C++ side; it only sees the pointer. (2) Using cython's address-of operator &, you can get the pointer to the start of the array a little more cleanly, with no explicit cast necessary: &X[0,0].
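
To make point (1) concrete, here is a minimal sketch of what the C++ side might look like. The question never shows myclass.h, so the body of run below is an assumption: a hypothetical kernel that simply doubles every element. It treats X and Y as flat, C-ordered (row-major) buffers of N*D doubles and computes the offset of element (i, j) itself as i*D + j.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the MyClass declared in "myclass.h".
// The real kernel is not shown in the question; this one just doubles each value.
class MyClass {
public:
    MyClass() {}

    // X and Y each point to N*D doubles laid out in C (row-major) order.
    void run(double* X, int N, int D, double* Y) {
        assert(X != nullptr && Y != nullptr);
        assert(N >= 0 && D >= 0);
        for (int i = 0; i < N; ++i) {
            for (int j = 0; j < D; ++j) {
                // Element (i, j) of a row-major N x D array lives at offset i*D + j.
                const std::size_t k = static_cast<std::size_t>(i) * D + j;
                Y[k] = 2.0 * X[k];
            }
        }
    }
};
```

Because a kernel like this only sees the pointer and the two sizes, it makes no difference whether numpy regards the buffer as 1D or 2D, as long as the memory is C-contiguous.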

So this is my edited version of your original snippet:

cimport numpy as np
import numpy as np

cdef extern from "myclass.h":
    cdef cppclass MyClass:
        MyClass() except +
        void run(double* X, int N, int D, double* Y)

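def run(np.ndarray[np.double_t, ndim=2] X):
    # Copies only if X is not already C-contiguous.
    X = np.ascontiguousarray(X)
    # Output buffer with the same shape and dtype as X.
    cdef np.ndarray[np.double_t, ndim=2, mode="c"] Y = np.zeros_like(X)

    cdef MyClass myclass
    myclass = MyClass()
    # &X[0,0] is the address of the first element, i.e. the start of the data.
    myclass.run(&X[0,0], X.shape[0], X.shape[1], &Y[0,0])

    return Y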
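
The np.ascontiguousarray call matters because the C++ method only receives a bare pointer and assumes a dense row-major layout. As a rough illustration in C++ terms, the sketch below shows the kind of copy that is needed when the input is a strided view (for example, every other column of a larger array). The helper name pack_contiguous and its stride parameters (counted in elements, not bytes) are hypothetical; they only mimic, conceptually, what numpy does when it has to copy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copy an N x D strided view into a fresh row-major buffer.
// Strides are given in elements here (numpy itself stores strides in bytes).
std::vector<double> pack_contiguous(const double* base, int N, int D,
                                    std::ptrdiff_t row_stride,
                                    std::ptrdiff_t col_stride) {
    assert(base != nullptr && N >= 0 && D >= 0);
    std::vector<double> out(static_cast<std::size_t>(N) * D);
    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < D; ++j) {
            // Read through the strides, write densely in row-major order.
            out[static_cast<std::size_t>(i) * D + j] =
                base[i * row_stride + j * col_stride];
        }
    }
    return out;
}
```

Passing out.data() to a method with the run signature is then safe, whereas passing base directly would make a row-major kernel read the skipped elements as well.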