I have a DataFrame, and I want to replace the values in a particular column that exceed a value with zero. I had thought this was a way of achieving this:
df[df.my_channel > 20000].my_channel = 0
If I copy the channel into a new data frame it's simple:
df2 = df.my_channel
df2[df2 > 20000] = 0
This does exactly what I want, but seems not to work with the channel as part of the original DataFrame.
.ix
indexer works okay for pandas version prior to 0.20.0, but since pandas 0.20.0, the .ix
indexer is deprecated, so you should avoid using it. Instead, you can use .loc
or iloc
indexers. You can solve this problem by:
mask = df.my_channel > 20000
column_name = 'my_channel'
df.loc[mask, column_name] = 0
Or, in one line,
df.loc[df.my_channel > 20000, 'my_channel'] = 0
mask
helps you to select the rows in which df.my_channel > 20000
is True
, while df.loc[mask, column_name] = 0
sets the value 0 to the selected rows where mask
holds in the column which name is column_name
.
Update:
In this case, you should use loc
because if you use iloc
, you will get a NotImplementedError
telling you that iLocation based boolean indexing on an integer type is not available.