Pandas: SettingWithCopyWarning

Jason · Apr 11, 2014

I'd like to replace values in a Pandas DataFrame larger than an arbitrary number (100 in this case) with NaN (as values this large are indicative of a failed experiment). Previously I've used this to replace unwanted values:

sve2_all[sve2_all[' Hgtot ng/l'] > 100] = np.nan

However, I got the following warning:

-c:3: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
C:\Users\AppData\Local\Enthought\Canopy32\User\lib\site-packages\pandas\core\indexing.py:346: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
self.obj[item] = s

From this StackExchange question, it seems that sometimes this warning can be ignored, but I can't follow the discussion well enough to be certain whether this applies to my situation. Is the warning basically letting me know that I'll be overwriting some of the values in my DataFrame?

Edit: As far as I can tell, everything behaved as it should. As a follow-up, is my method of replacing values non-standard? Is there a better way to replace values?

Answer

Andy Hayden · Apr 11, 2014

As suggested in the warning message, you should use loc to do this:

sve2_all.loc[sve2_all[' Hgtot ng/l'] > 100] = np.nan
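To see what that line does, here is a minimal sketch on a made-up frame. The name `readings`, the `depth m` column, and the values are assumptions for illustration, not the asker's data. Passing only a boolean mask to loc blanks every column of the matching rows; passing a column label as well limits the change to that column, which may be closer to what you want.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for sve2_all; the leading space in the column
# name mirrors the question.
readings = pd.DataFrame({
    "depth m": [0.5, 1.0, 1.5, 2.0],
    " Hgtot ng/l": [3.2, 180.0, 7.5, 101.0],
})

# Boolean mask marking the rows whose reading is above the threshold.
failed = readings[" Hgtot ng/l"] > 100

# Mask only: every column of the matching rows becomes NaN.
rows_blanked = readings.copy()
rows_blanked.loc[failed] = np.nan

# Mask plus column label: only that column is overwritten.
column_blanked = readings.copy()
column_blanked.loc[failed, " Hgtot ng/l"] = np.nan

print(rows_blanked)
print(column_blanked)
```

Both variants go through a single loc call on the frame itself, so there is no intermediate object for pandas to worry about.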

The warning is there to stop you from modifying a copy: here sve2_all[sve2_all[' Hgtot ng/l'] > 100] is potentially a copy, and if it is, any modifications would not change the original frame. It may work correctly in some cases, but pandas cannot guarantee that it will work in all cases, so use it at your own risk (consider yourself warned! ;) ).
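The "modifying a copy" case is easiest to see with chained indexing, where the frame is filtered first and the result is then assigned into. The sketch below uses a made-up frame (the name `original` and the values are assumptions), and the warning is silenced only to keep the output short. On the pandas versions I am aware of, selecting with a boolean mask returns a new object, so the chained write never reaches the original, while the loc write does.

```python
import warnings

import numpy as np
import pandas as pd

# Hypothetical data; only the column name comes from the question.
original = pd.DataFrame({" Hgtot ng/l": [12.0, 250.0, 40.0]})
too_high = original[" Hgtot ng/l"] > 100

# Chained indexing: original[too_high] builds a separate object first,
# and the assignment then writes into that temporary object.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide the warning for this demo only
    original[too_high][" Hgtot ng/l"] = np.nan

after_chained = original[" Hgtot ng/l"].tolist()  # the 250.0 is still there

# A single loc call indexes and assigns in one step on the original frame.
original.loc[too_high, " Hgtot ng/l"] = np.nan
after_loc = original[" Hgtot ng/l"].tolist()  # the 250.0 is now NaN

print(after_chained)
print(after_loc)
```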