Python - ValueError: Cannot index with vector containing NA / NaN values

harry04 · Feb 19, 2018 · Viewed 14.5k times

I'm trying to compute the average price of the products in a dataframe whose names contain any substring from a wordlist. The following code has worked on multiple spreadsheets:

dframe['Product'].fillna('', inplace=True)
dframe['Price'].fillna(0, inplace=True)
total_count = 0
total_price = 0
for word in ransomware_wordlist:
    mask = dframe.Product.str.contains(word, case=False)
    total_count += mask.sum()
    total_price += dframe.loc[mask, 'Price'].sum()
average_price = total_price / total_count
print(average_price)

However, one of the spreadsheets throws an error at this line:

dframe['Product'].fillna('', inplace=True)

with

ValueError: cannot index with vector containing NA / NaN values

I fail to understand why dframe['Product'].fillna('', inplace=True) isn't handling this problem.

In desperate need of some help! Thanks!

Answer

jezrael · Feb 19, 2018

If the first line fails, you can still handle the NaNs in the condition itself by passing na=False to str.contains:

mask = dframe.Product.str.contains(word, case=False, na=False)
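
As a rough illustration of what na=False does, here is a minimal sketch on a made-up frame. The sample rows and the word "locker" are assumptions for the example; only the column names come from the question. With na=False, a missing Product counts as "no match", so the mask is strictly boolean and is safe to pass to .loc:

```python
import numpy as np
import pandas as pd

# Hypothetical data; only the column names come from the question.
dframe = pd.DataFrame({
    "Product": ["CryptoLocker kit", np.nan, "USB cable"],
    "Price": [100.0, 5.0, 3.0],
})

# na=False makes the row with a missing Product evaluate to False
# instead of propagating a missing value into the mask.
mask = dframe["Product"].str.contains("locker", case=False, na=False)

# A purely boolean mask can be used for row selection without errors.
matched_total = dframe.loc[mask, "Price"].sum()
print(mask.tolist(), matched_total)
```

Without na=False, a missing value in Product can carry through into the mask, and that is what .loc refuses to index with.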

Or try omitting inplace=True and assigning the result back:

dframe['Product'] = dframe['Product'].fillna('')
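
Putting both suggestions together, the loop from the question could look roughly like the sketch below. The function name average_matching_price and the sample data are hypothetical, and the zero-match guard is an addition that is not in the original code:

```python
import numpy as np
import pandas as pd


def average_matching_price(dframe, wordlist):
    """Average price over rows whose Product contains a word from wordlist.

    Mirrors the loop in the question: a row that matches several words
    is counted once per matching word.
    """
    # Work on a copy and assign the filled columns back instead of
    # relying on inplace=True.
    dframe = dframe.copy()
    dframe["Product"] = dframe["Product"].fillna("")
    dframe["Price"] = dframe["Price"].fillna(0)

    total_count = 0
    total_price = 0.0
    for word in wordlist:
        # na=False keeps the mask strictly boolean even if a NaN slips through.
        mask = dframe["Product"].str.contains(word, case=False, na=False)
        total_count += int(mask.sum())
        total_price += float(dframe.loc[mask, "Price"].sum())

    # Added guard: return NaN rather than dividing by zero when nothing matches.
    return total_price / total_count if total_count else float("nan")


# Hypothetical sample data for illustration only.
sample = pd.DataFrame({
    "Product": ["CryptoLocker kit", np.nan, "WannaCry builder", "USB cable"],
    "Price": [100.0, 5.0, np.nan, 3.0],
})
average = average_matching_price(sample, ["locker", "wannacry"])
print(average)
```

Note that a missing Price is filled with 0 here, as in the question, so such rows still count toward the number of matches and pull the average down.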