I'm trying to get the average price of the products in a dataframe whose names contain any substring from a wordlist. I've been able to do so with the following code on multiple spreadsheets:
dframe['Product'].fillna('', inplace=True)
dframe['Price'].fillna(0, inplace=True)
total_count = 0
total_price = 0
for word in ransomware_wordlist:
    mask = dframe.Product.str.contains(word, case=False)
    total_count += mask.sum()
    total_price += dframe.loc[mask, 'Price'].sum()
average_price = total_price / total_count
print(average_price)
However, one of the spreadsheets throws an error at this line:
dframe['Product'].fillna('', inplace=True)
with
ValueError: cannot index with vector containing NA / NaN values
I fail to understand why dframe['Product'].fillna('', inplace=True) isn't handling this problem.
In desperate need of some help! Thanks!
If the first line fails, it is still possible to replace the NaNs in the condition by passing the parameter na=False to str.contains:
mask = dframe.Product.str.contains(word, case=False, na=False)
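To see what na=False changes, here is a minimal sketch on made-up data (the product names are hypothetical, and the column is forced to object dtype so the missing value stays in the mask). Without na, the missing entry is missing in the result too; with na=False it simply counts as a non-match:

```python
import pandas as pd

# Hypothetical sample data; None stands in for an empty spreadsheet cell.
products = pd.Series(["Locky ransomware", None, "WannaCry kit"], dtype=object)

# Without na=..., the missing value propagates into the mask.
raw_mask = products.str.contains("ransomware", case=False)

# With na=False, the missing value is treated as "does not match",
# so the mask is purely boolean and safe to use for indexing.
clean_mask = products.str.contains("ransomware", case=False, na=False)

print(raw_mask)
print(clean_mask)
```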
Or try omitting inplace=True and assigning the result back:
dframe['Product'] = dframe['Product'].fillna('')
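Putting both suggestions together, the whole calculation could look roughly like this. It is only a sketch: the function name average_matching_price and the sample data are made up for illustration, and it keeps the behaviour of the loop in the question, so a product that matches several words is counted once per word.

```python
import pandas as pd

def average_matching_price(dframe, wordlist):
    """Hypothetical helper: mean price over all (word, product) matches."""
    # Work on filled copies instead of relying on inplace=True.
    products = dframe["Product"].fillna("")
    prices = dframe["Price"].fillna(0)

    total_count = 0
    total_price = 0
    for word in wordlist:
        # na=False keeps the mask boolean even if a missing value slips through.
        mask = products.str.contains(word, case=False, na=False)
        total_count += mask.sum()
        total_price += prices[mask].sum()

    # Return NaN rather than dividing by zero when nothing matched.
    return total_price / total_count if total_count else float("nan")

# Made-up sample data with a missing product name and a missing price.
dframe = pd.DataFrame({
    "Product": ["Locky ransomware", None, "WannaCry kit", "Keylogger"],
    "Price": [100.0, 50.0, None, 20.0],
})
ransomware_wordlist = ["ransomware", "wannacry"]

average_price = average_matching_price(dframe, ransomware_wordlist)
print(average_price)
```

Note that str.contains treats each word as a regular expression by default, so words containing characters such as "." or "+" may need regex=False.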