Overwriting NaN values with .loc in Pandas

ErnieandBert · Feb 24, 2017 · Viewed 17.2k times

I tried to solve the required task with the following code line:

df['Age'][np.isnan(df["Age"])] = rand1

But this raises a SettingWithCopyWarning, and I think locating the NaN values in the 'Age' column of the dataframe with .loc might be a better way of doing this.

I have already looked at the documentation but still don't know how to fix this problem, and I couldn't find any solutions on here that use .loc either.

I would appreciate any hints and advice.

Answer

jezrael · Feb 24, 2017

You need fillna to replace NaN with some value:

df.Age = df.Age.fillna(rand1)

Your solution with loc:

df.loc[np.isnan(df["Age"]), 'Age'] = rand1
# same as
# df.loc[df["Age"].isnull(), 'Age'] = rand1

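You can also check the "Returning a view versus a copy" section of the pandas indexing documentation, which explains where the SettingWithCopyWarning comes from.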
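To make the view-versus-copy point concrete, here is a small sketch (not from the question; the frame and the value of rand1 are made up). It uses the row-first chained form df[mask]['Age'] = ..., which is a slightly different order from the line in the question, because that form reliably writes to a temporary copy. Whether the question's column-first form updates the original can depend on the pandas version, so treat this only as an illustration of why a single .loc call is safer.

```python
import warnings

import numpy as np
import pandas as pd

df = pd.DataFrame({"Age": [20, 23, np.nan]})
rand1 = 30  # placeholder fill value, assumed for this sketch

# Chained indexing: df[mask] builds a new DataFrame first, so the
# assignment lands on that temporary object and df stays unchanged.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide the chained-assignment warning
    df[df["Age"].isna()]["Age"] = rand1

still_missing = int(df["Age"].isna().sum())
print(still_missing)

# One .loc call selects rows and column together and writes into df itself.
df.loc[df["Age"].isna(), "Age"] = rand1
filled = df["Age"].tolist()
print(filled)
```

The first print shows that the NaN is still there after the chained assignment; the second shows it replaced after the .loc assignment.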

Sample:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Age': [20, 23, np.nan]})
print(df)
    Age
0  20.0
1  23.0
2   NaN

rand1 = 30
df.Age = df.Age.fillna(rand1)
print(df)
    Age
0  20.0
1  23.0
2  30.0

# if you need integer ages, cast after filling
df.Age = df.Age.fillna(rand1).astype(int)
print(df)
   Age
0   20
1   23
2   30
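
The sample above only runs the fillna version. As a quick check, the .loc form from the answer can be run on the same made-up data to see that it gives the same column. The names by_fillna and by_loc are just for this sketch.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Age": [20, 23, np.nan]})
rand1 = 30  # same placeholder value as in the sample

# Variant 1: fillna returns a new Series with the NaN replaced.
by_fillna = df["Age"].fillna(rand1)

# Variant 2: .loc with a boolean mask assigns in place on a copy of df.
by_loc = df.copy()
by_loc.loc[by_loc["Age"].isnull(), "Age"] = rand1

# Both variants should produce the same float column.
same = by_fillna.equals(by_loc["Age"])
print(same)

# Casting to int works once no NaN is left in the column.
as_int = by_loc["Age"].astype(int).tolist()
print(as_int)
```

fillna is shorter when every NaN gets the same value; the .loc form is handy when the mask needs extra conditions besides "is NaN".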