I don't really understand why regular indexing can't be used for np.delete. What makes np.s_ so special?
For example with this code, used to delete some of the rows of this array..
inlet_names = np.delete(inlet_names, np.s_[1:9], axis = 0)
Why can't I simply use regular indexing and do..
inlet_names = np.delete(inlet_names, [1:9], axis = 0)
or
inlet_names = np.delete(inlet_names, inlet_names[1:9], axis = 0)
From what I can gather, np.s_ is the same as np.index_exp except it doesn't return a tuple, but both can be used anywhere in Python code.
Then when I look into the np.delete function, it indicates that you can use something like [1,2,3] to delete those specific indexes along the entire array. So what's preventing me from using something similar to delete certain rows or columns from the array?
I'm simply assuming that this type of indexing is read as something else in np.delete so you need to use np.s_ in order to specify, but I can't get to the bottom of what exactly it would be reading it as because when I try the second piece of code it simply returns "invalid syntax". Which is weird because this code works...
inlet_names = np.delete(inlet_names, [1,2,3,4,5,6,7,8,9], axis = 0)
So I guess the answer could possibly be that np.delete only accepts a list of the indexes that you would like to delete. And that np.s_ returns a list of the indexes that you specify for the slice.
Just could use some clarification and some corrections on anything I just said about the functions that may be wrong, because a lot of this is just my take, the documents don't exactly explain everything that I was trying to understand. I think I'm just overthinking this, but I would like to actually understand it, if someone could explain it.
np.delete is not doing anything unique or special. It just returns a copy of the original array with some items missing. Most of the code just interprets the inputs in preparation to make this copy.
What you are asking about is the obj parameter:
obj : slice, int or array of ints
In simple terms, np.s_ lets you supply a slice using the familiar : syntax. The x:y notation cannot be used as a function parameter.
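As a minimal sketch of that point (the array x and the variable names are illustrative, not taken from the question), np.s_ simply hands back the slice object that Python builds when : appears inside square brackets:

```python
import numpy as np

x = np.arange(10) * 2                  # illustrative data: 0, 2, ..., 18

s = np.s_[3:6]                         # the colon is legal here: it sits inside []
same_slice = (s == slice(3, 6))        # np.s_ returned a plain slice object

# Passing the np.s_ result or an explicit slice() gives the same array
same_result = np.array_equal(np.delete(x, s), np.delete(x, slice(3, 6)))

print(type(s).__name__, same_slice, same_result)
```

So np.s_ does not produce a list of indices; it is only a way of spelling slice(3, 6) with the colon notation.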
Let's try your alternatives (you allude to these in results and errors, but they are buried in the text):
In [213]: x=np.arange(10)*2 # some distinctive values
In [214]: np.delete(x, np.s_[3:6])
Out[214]: array([ 0, 2, 4, 12, 14, 16, 18])
So delete with s_ removes a range of values, namely 6 8 10, the items at indices 3 through 5.
In [215]: np.delete(x, [3:6])
File "<ipython-input-215-0a5bf5cc05ba>", line 1
np.delete(x, [3:6])
^
SyntaxError: invalid syntax
Why the error? Because 3:6 is slice notation, which Python accepts only inside an indexing expression such as x[3:6]. np.delete(...) is a function call, and a bare [3:6] is a list display, not an index. Even s_[[3:6]] has problems. np.delete(x, 3:6) is also bad, for the same reason: Python only accepts the : syntax in an indexing context, where it automatically translates it into a slice object. Note that this is a SyntaxError, something that the interpreter catches before doing any calculations or function calls.
In [216]: np.delete(x, slice(3,6))
Out[216]: array([ 0, 2, 4, 12, 14, 16, 18])
A slice works instead of s_; in fact that is what s_ produces.
In [233]: np.delete(x, [3,4,5])
Out[233]: array([ 0, 2, 4, 12, 14, 16, 18])
A list also works, though it works in a different way (see below).
In [217]: np.delete(x, x[3:6])
Out[217]: array([ 0, 2, 4, 6, 8, 10, 14, 18])
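This works, but produces a different result, because x[3:6] is not the same as range(3,6). It holds the values 6, 8, 10, which are then used as indices: the items at indices 6 and 8 (12 and 16) are removed, and index 10, which is out of range, was ignored by the NumPy version used here. Also, np.delete does not work like the list remove method. It deletes by index, not by matching value.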
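If the intent really is to drop elements by matching value, one option is a boolean mask built with np.isin. This is a sketch under the assumption that the values in x are unique, which is why both approaches agree here; the names by_index, unwanted and by_value are illustrative:

```python
import numpy as np

x = np.arange(10) * 2                    # 0, 2, 4, ..., 18

# np.delete removes by position
by_index = np.delete(x, [3, 4, 5])

# Removing by value: keep every element that is NOT one of the unwanted values
unwanted = x[3:6]                        # the values 6, 8, 10
by_value = x[~np.isin(x, unwanted)]

print(by_index)
print(by_value)
```

With repeated values in x the two would differ: the mask drops every occurrence of an unwanted value, while np.delete drops only the listed positions.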
np.index_exp fails for the same reason that np.delete(x, (slice(3,6),)) does. 1, [1], and (1,) are all valid and remove one item. Even '1', the string, works in the NumPy version used here. delete parses this argument and, at this level, expects something that can be turned into an integer, via obj.astype(intp). (slice(None),) is not a slice; it is a 1-item tuple, so it's handled in a different spot in the delete code. The result is a TypeError produced by something that delete calls, which is very different from the SyntaxError above. In theory delete could extract the slice from the tuple and proceed as in the s_ case, but the developers did not choose to handle this variation.
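If you do end up holding an np.index_exp result, a simple workaround is to unwrap the tuple yourself before calling delete. A minimal sketch (idx and result are illustrative names):

```python
import numpy as np

x = np.arange(10) * 2

idx = np.index_exp[3:6]          # a 1-item tuple: (slice(3, 6, None),)
result = np.delete(x, idx[0])    # pass the slice itself, not the tuple

print(idx, result)
```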
A quick study of the code shows that np.delete uses 2 distinct copying methods: by slice and by boolean mask. If the obj is a slice, as in our example, it does (for a 1d array):
out = np.empty(7, dtype=x.dtype)
out[0:3] = x[0:3]   # items before the deleted range
out[3:7] = x[6:10]  # items after the deleted range
But with [3,4,5] (instead of the slice) it does:
keep = np.ones((10,), dtype=bool)
keep[[3,4,5]] = False
return x[keep]
Same result, but with a different construction method. x[np.array([1,1,1,0,0,0,1,1,1,1],bool)] does the same thing.
In fact boolean indexing or masking like this is more common than np.delete, and generally just as powerful.
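As a rough illustration of masking used in place of np.delete (the names positions, keep, masked and small are assumptions made for this sketch), the mask can be built from positions or directly from a condition on the values:

```python
import numpy as np

x = np.arange(10) * 2

# Mask built from positions: equivalent to np.delete(x, np.s_[3:6])
positions = np.arange(x.size)
keep = (positions < 3) | (positions >= 6)
masked = x[keep]

# Mask built from a condition on the values themselves
small = x[x <= 10]

print(masked)
print(small)
```

The second form has no direct np.delete equivalent without first computing the indices, which is part of why masking tends to be the more common idiom.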
From the lib/index_tricks.py source file:
index_exp = IndexExpression(maketuple=True)
s_ = IndexExpression(maketuple=False)
They are slightly different versions of the same thing. And both are just convenience objects.
In [196]: np.s_[1:4]
Out[196]: slice(1, 4, None)
In [197]: np.index_exp[1:4]
Out[197]: (slice(1, 4, None),)
In [198]: np.s_[1:4, 5:10]
Out[198]: (slice(1, 4, None), slice(5, 10, None))
In [199]: np.index_exp[1:4, 5:10]
Out[199]: (slice(1, 4, None), slice(5, 10, None))
The maketuple business applies only when there is a single item, a slice or index.
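Coming back to the row-deletion case in the question, here is a small sketch using a hypothetical 12-row stand-in for inlet_names (the real array is not shown in the question). It also shows that the slice 1:9 and the list [1,2,3,4,5,6,7,8,9] are not the same selection, since a slice excludes its stop value:

```python
import numpy as np

# Hypothetical stand-in for the question's array: 12 rows, 2 columns
inlet_names = np.arange(24).reshape(12, 2)

with_slice = np.delete(inlet_names, np.s_[1:9], axis=0)           # rows 1..8
with_range = np.delete(inlet_names, list(range(1, 9)), axis=0)    # rows 1..8
with_list9 = np.delete(inlet_names, [1, 2, 3, 4, 5, 6, 7, 8, 9], axis=0)  # rows 1..9

print(with_slice.shape, with_range.shape, with_list9.shape)
```

The first two calls remove the same eight rows; the third removes nine, because the list includes index 9.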