At first, I tried writing some code that looked like this:
import numpy as np
import pandas as pd
np.random.seed(2016)
train = pd.DataFrame(np.random.choice([np.nan, 1, 2], size=(10, 3)),
                     columns=['Age', 'SibSp', 'Parch'])
complete = train.dropna()
complete['AgeGt15'] = complete['Age'] > 15
After getting a SettingWithCopyWarning, I tried using .loc:
complete.loc[:, 'AgeGt15'] = complete['Age'] > 15
complete.loc[:, 'WithFamily'] = complete['SibSp'] + complete['Parch'] > 0
However, I still get the same warning. What gives?
Note: As of pandas version 0.24, is_copy is deprecated and will be removed in a future version. While the private attribute _is_copy exists, the underscore indicates that it is not part of the public API and should not be depended upon. Going forward, it seems the only proper way to silence SettingWithCopyWarning will be to do so globally:
pd.options.mode.chained_assignment = None
When complete = train.dropna() is executed, dropna might return a copy, so out of an abundance of caution, pandas sets complete.is_copy to a truthy value:
In [220]: complete.is_copy
Out[220]: <weakref at 0x7f7f0b295b38; to 'DataFrame' at 0x7f7eee6fe668>
This allows pandas to warn you later, when complete['AgeGt15'] = complete['Age'] > 15 is executed, that you may be modifying a copy, which will have no effect on train. For beginners this may be a useful warning. In your case, it appears you have no intention of modifying train indirectly by modifying complete, so the warning is just a meaningless annoyance.
You can silence the warning by setting:
complete.is_copy = False # deprecated as of version 0.24
This is quicker than making an actual copy, and nips the SettingWithCopyWarning in the bud (at the point where _check_setitem_copy is called):
def _check_setitem_copy(self, stacklevel=4, t='setting', force=False):
    if force or self.is_copy:
        ...
If you are really confident you know what you are doing, you can shut off the SettingWithCopyWarning globally with:
pd.options.mode.chained_assignment = None # None|'warn'|'raise'
An alternative way to silence the warning is to make a new copy:
complete = complete.copy()
However, you may not want to do this if the DataFrame is large, since copying can take a significant amount of time and memory, and it is completely pointless (except for the sake of silencing a warning) if you know complete is already a copy.
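For completeness, here is a minimal sketch of the copy-based approach applied to the example from the question. It assumes the extra memory is acceptable for your data; the only change is that the copy is taken right after dropna, so complete is an independent DataFrame before any columns are added. The parentheses around the sum are added for readability and do not change the result.

```python
import numpy as np
import pandas as pd

np.random.seed(2016)
train = pd.DataFrame(np.random.choice([np.nan, 1, 2], size=(10, 3)),
                     columns=['Age', 'SibSp', 'Parch'])

# Take an explicit copy right after filtering, so that `complete`
# is an independent DataFrame rather than a possible view of `train`.
complete = train.dropna().copy()

# Plain column assignment now targets the independent copy.
complete['AgeGt15'] = complete['Age'] > 15
complete['WithFamily'] = (complete['SibSp'] + complete['Parch']) > 0
```

Since complete is its own object here, the new columns are added to it alone and train is left untouched.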