numpy array TypeError: only integer scalar arrays can be converted to a scalar index

kinder chen picture kinder chen · Oct 24, 2017 · Viewed 295.6k times · Source
i=np.arange(1,4,dtype=np.int)
a=np.arange(9).reshape(3,3)

and

a
>>>array([[0, 1, 2],
          [3, 4, 5],
          [6, 7, 8]])
a[:,0:1]
>>>array([[0],
          [3],
          [6]])
a[:,0:2]
>>>array([[0, 1],
          [3, 4],
          [6, 7]])
a[:,0:3]
>>>array([[0, 1, 2],
          [3, 4, 5],
          [6, 7, 8]])

Now I want to vectorize the array to print them all together. I try

a[:,0:i]

or

a[:,0:i[:,None]]

It gives TypeError: only integer scalar arrays can be converted to a scalar index

Answer

Maxim picture Maxim · Jan 13, 2018

Short answer:

[a[:,:j] for j in i]

What you are trying to do is not a vectorizable operation. Wikipedia defines vectorization as a batch operation on a single array, instead of on individual scalars:

In computer science, array programming languages (also known as vector or multidimensional languages) generalize operations on scalars to apply transparently to vectors, matrices, and higher-dimensional arrays.

...

... an operation that operates on entire arrays can be called a vectorized operation...

In terms of CPU-level optimization, the definition of vectorization is:

"Vectorization" (simplified) is the process of rewriting a loop so that instead of processing a single element of an array N times, it processes (say) 4 elements of the array simultaneously N/4 times.

The problem with your case is that the result of each individual operation has a different shape: (3, 1), (3, 2) and (3, 3). They can not form the output of a single vectorized operation, because the output has to be one contiguous array. Of course, it can contain (3, 1), (3, 2) and (3, 3) arrays inside of it (as views), but that's what your original array a already does.

What you're really looking for is just a single expression that computes all of them:

[a[:,:j] for j in i]

... but it's not vectorized in a sense of performance optimization. Under the hood it's plain old for loop that computes each item one by one.