Passing 3-dimensional numpy array to C

DaveTheScientist picture DaveTheScientist · Mar 22, 2013 · Viewed 7.9k times · Source

I'm writing a C extension to my Python program for speed purposes, and running into some very strange behaviour trying to pass in a 3-dimensional numpy array. It works with a 2-dimensional array, but I'm sure I'm screwing something up with the pointers trying to get it to work with the 3rd dimension. But here's the weird part. If I just pass in a 3-D array, it crashes with a Bus Error. If (in Python) I create my variable as a 2D array first, and then overwrite it with a 3D array, it works perfectly. If the variable is an empty array first and then a 3D array, it crashes with a Seg Fault. How can that possibly happen?

Also, can anyone help me get a 3D array working? Or should I just give up and pass in a 2D array and reshape it myself?

Here's my C code:

static PyObject* func(PyObject* self, PyObject* args) {
  PyObject *list2_obj;
  PyObject *list3_obj;
  if (!PyArg_ParseTuple(args, "OO", &list2_obj, &list3_obj))
    return NULL;

  double **list2;
  double ***list3;

  //Create C arrays from numpy objects:
  int typenum = NPY_DOUBLE;
  PyArray_Descr *descr;
  descr = PyArray_DescrFromType(typenum);
  npy_intp dims[3];
  if (PyArray_AsCArray(&list2_obj, (void **)&list2, dims, 2, descr) < 0 || PyArray_AsCArray(&list3_obj, (void ***)&list3, dims, 3, descr) < 0) {
    PyErr_SetString(PyExc_TypeError, "error converting to c array");
    return NULL;
  }
  printf("2D: %f, 3D: %f.\n", list2[3][1], list3[1][0][2]);
}

And here's my Python code that calls the above function:

import cmod, numpy
l2 = numpy.array([[1.0,2.0,3.0], [4.0,5.0,6.0], [7.0,8.0,9.0], [3.0, 5.0, 0.0]])

l3 = numpy.array([[2,7, 1], [6, 3, 9], [1, 10, 13], [4, 2, 6]])  # Line A
l3 = numpy.array([])                                             # Line B

l3 = numpy.array([[[2,7, 1, 11], [6, 3, 9, 12]],
                 [[1, 10, 13, 15], [4, 2, 6, 2]]])

cmod.func(l2, l3)

So, if I comment out both Line A and B, it crashes with a Bus error. If Line A is there, but Line B is commented out, it runs correctly with no errors. If Line B is there but Line A is commented out, it prints the correct numbers but then Seg faults. Finally if both lines are present it also prints the correct numbers and then Seg faults. What in the hell is going on here?

EDIT: Ok. Wow. So I was using int in Python but calling them double in C. And that was working fine with 1D and 2D arrays. But not 3D. So I changed the Python definition of l3 to be floats, and now it all works fantastically (Thank you very much Bi Rico).

But now, more strange behaviour with Lines A & B! Now if both Lines are commented out, the program works. If Line B is present but A is commented out, it works, and ditto if both are uncommented. But if Line A is present and B is commented out, I get that fantastic Bus error again. I'd really like to avoid these in the future, so does anyone have any clue why the declaration of a Python variable can have this kind of impact?

EDIT 2: Well, as insane as these errors are, they're all due to the 3-dimensional numpy array I pass in. If I only pass in 1- or 2-D arrays, it behaves as expected, and manipulation of the other Python variables does nothing. This leads me to believe that the problem lies somewhere in Python's reference counting. In the C-code the reference count is decreased more than it should for the 3-D arrays, and when that function returns Python tries to clean up objects, and attempts to delete a NULL pointer. This is just my guess, and I've tried to Py_INCREF(); everything I could think of to no avail. I guess I'll just be using a 2D array and reshaping it in C.

Answer

user1149913 picture user1149913 · Mar 22, 2013

Rather than converting to a c-style array, I usually access numpy array elements directly using PyArray_GETPTR (see http://docs.scipy.org/doc/numpy/reference/c-api.array.html#data-access).

For instance, to access an element of a 3-dimensional numpy array of type double use double elem=*((double *)PyArray_GETPTR3(list3_obj,i,j,k)).

For your application, you could detect the correct number of dimensions for each array using PyArray_NDIM, then access elements using the appropriate version of PyArray_GETPTR.