I'm trying to create a word cloud from a CSV file. As an example, the file has the following structure:
a,1
b,2
c,4
j,20
The real file has more rows, about 1,800. The first column holds string values (names) and the second column holds their respective frequencies (int). The file is then read and each key,value row is stored in a dictionary (d), which is used later to plot the word cloud:
reader = csv.reader(open('namesDFtoCSV', 'r',newline='\n'))
d = {}
for k,v in reader:
    d[k] = v
Once we have the dictionary full of values, I try to plot the wordcloud:
#Generating wordcloud. Relative scaling value is to adjust the importance of a frequency word.
#See documentation: https://github.com/amueller/word_cloud/blob/master/wordcloud/wordcloud.py
wordcloud = WordCloud(width=900,height=500, max_words=1628,relative_scaling=1,normalize_plurals=False).generate_from_frequencies(d)
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()
But an error is thrown:
Traceback (most recent call last):
  File ".........../script.py", line 19, in <module>
    wordcloud = WordCloud(width=900,height=500, max_words=1628,relative_scaling=1,normalize_plurals=False).generate_from_frequencies(d)
  File "/usr/local/lib/python3.5/dist-packages/wordcloud/wordcloud.py", line 360, in generate_from_frequencies
    for word, freq in frequencies]
  File "/usr/local/lib/python3.5/dist-packages/wordcloud/wordcloud.py", line 360, in <listcomp>
    for word, freq in frequencies]
TypeError: unsupported operand type(s) for /: 'str' and 'float'
Finally, the documentation says:
def generate_from_frequencies(self, frequencies, max_font_size=None):
    """Create a word_cloud from words and frequencies.
    Parameters
    ----------
    frequencies : dict from string to float
        A contains words and associated frequency.
    max_font_size : int
        Use this font-size instead of self.max_font_size
    Returns
    -------
    self
So I don't understand why it is throwing this error if I met the requirements of the function. I hope someone can help me, thanks.
Note: I am using wordcloud 1.3.1.
This is because the values in your dictionary are strings, but wordcloud expects integers or floats.
After running your code and inspecting your dictionary d, I get the following:
In [12]: d
Out[12]: {'a': '1', 'b': '2', 'c': '4', 'j': '20'}
Note the quotes (' ') around the numbers: they mean the values are really strings.
A hacky way to resolve this is to cast v to an int in your for loop, like:
d[k] = int(v)
I say this is hacky because it works for integers, but if your input contains floats (for example '2.5'), int() will raise a ValueError.
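If the counts might contain decimals, one option is to convert with float() instead, since it accepts both '4' and '4.5'. Below is a minimal sketch, assuming every useful row has exactly two columns. The helper name load_frequencies is hypothetical, and the inline sample data only stands in for your file:

```python
import csv
import io


def load_frequencies(lines):
    """Build a {word: frequency} dict from an iterable of CSV lines.

    Hypothetical helper for illustration; not part of wordcloud.
    """
    frequencies = {}
    for row in csv.reader(lines):
        if len(row) != 2:
            continue  # skip blank or malformed rows
        word, count = row
        # float() parses both integer-like and decimal strings
        frequencies[word] = float(count)
    return frequencies


# Sample data standing in for the real CSV file (assumption)
sample = io.StringIO("a,1\nb,2\nc,4.5\nj,20\n")
d = load_frequencies(sample)
print(d)
```

With your real file you would pass the open file object instead of the StringIO, then hand d to generate_from_frequencies as before.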
Also, Python errors can be difficult to read. Your error above can be reduced to these two parts:
script.py", line 19
TypeError: unsupported operand type(s) for /: 'str' and 'float'
and read as:
"There's a type error on or before line 19 of my file. Let me look at my data types to see if there is any mismatch between string and float..."
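To see where that message comes from, the same TypeError can be reproduced in isolation. This is only a sketch: the list comprehension is an assumed stand-in for the division the traceback points at, not the library's actual code, and max_frequency is a made-up value.

```python
# Values are strings, exactly as csv.reader returns them
frequencies = {'a': '1', 'b': '2'}
max_frequency = 2.0  # made-up divisor for illustration

try:
    # Dividing a str by a float is what triggers the error
    scaled = [(word, freq / max_frequency) for word, freq in frequencies.items()]
except TypeError as exc:
    message = str(exc)
    print(message)
```

Converting the values to numbers before the division makes the error go away.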
The code below works for me:
import csv
from wordcloud import WordCloud
import matplotlib.pyplot as plt
reader = csv.reader(open('namesDFtoCSV', 'r',newline='\n'))
d = {}
for k,v in reader:
    d[k] = int(v)
#Generating wordcloud. Relative scaling value is to adjust the importance of a frequency word.
#See documentation: https://github.com/amueller/word_cloud/blob/master/wordcloud/wordcloud.py
wordcloud = WordCloud(width=900,height=500, max_words=1628,relative_scaling=1,normalize_plurals=False).generate_from_frequencies(d)
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()