I just wrote a script that extracts all the spoken text in the Dutch Parliament from a few thousand XML files. For every speaker it counts how many times that speaker said each word.
After doing this I calculated the TF*IDF value of every word for each speaker in the Dutch Parliament. If you are not familiar with this, see this link: TF IDF explanation
So now I have a dictionary for each speaker in the Dutch Parliament, where the keys are the words that speaker said and the values are the corresponding TF*IDF values:
{u'asielzoekers': 0.0034861170591325486,
u'belastingverlaging': 0.0018551991553514675,
u'buma': 0.0020712555982839408,
u'islam': 0.0029519544163739155,
u'moslims': 0.0027958002747301355,
u'ouderen': 0.0022803123245457566,
u'pechtold': 0.0021525864470786928,
u'president': 0.003281844532743345,
u'rutte': 0.0023488684001475584,
u'samsom': 0.0019304632325980841}
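A minimal sketch of how such per-speaker scores can be computed, for readers who want to reproduce the setup. It assumes the word counts are already available as one Counter per speaker and uses the plain variant TF = count / total words, IDF = log(number of speakers / number of speakers using the word). The function name tf_idf, the input shape and the toy data are illustrative assumptions; the actual script may weight terms differently.

```python
import math
from collections import Counter


def tf_idf(counts_per_speaker):
    """Return {speaker: {word: tf*idf}} for {speaker: Counter(word -> count)}.

    Hypothetical helper: treats each speaker as one "document".
    """
    n_speakers = len(counts_per_speaker)

    # Document frequency: the number of speakers who used each word at least once.
    doc_freq = Counter()
    for counts in counts_per_speaker.values():
        doc_freq.update(counts.keys())

    scores = {}
    for speaker, counts in counts_per_speaker.items():
        total = float(sum(counts.values()))
        scores[speaker] = {
            word: (count / total) * math.log(n_speakers / float(doc_freq[word]))
            for word, count in counts.items()
        }
    return scores


# Toy data, not taken from the real transcripts.
example = {
    "speaker_a": Counter({"islam": 3, "de": 7}),
    "speaker_b": Counter({"ouderen": 2, "de": 8}),
}
scores = tf_idf(example)
```

With this variant, a word used by every speaker gets IDF = log(1) = 0, so very common words drop out of the ranking on their own.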
Now I want to create a word cloud from these values. I have briefly looked into the wordcloud module written by amueller, but as far as I can see this module does not work with a dictionary, only with plain text.
So any help on how to create a wordcloud from a dictionary's values would be highly appreciated.
Thanks in advance!
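# TF*IDF scores per word (truncated here; use the full dictionary from the question)
dictionary = {u'asielzoekers': 0.0034861170591325486,.. u'samsom': 0.0019304632325980841}
import matplotlib.pyplot as plt
from wordcloud import WordCloud
# generate_from_frequencies takes a word -> weight mapping directly, so no raw text is needed
wc = WordCloud(background_color="white", width=1000, height=1000, max_words=10, relative_scaling=0.5, normalize_plurals=False).generate_from_frequencies(dictionary)
plt.imshow(wc)
plt.axis("off")  # hide the axis ticks around the image
plt.show()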
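If the dictionary is large or contains words with a score of 0 (which a plain TF*IDF gives to words every speaker uses), it may help to filter it before handing it to WordCloud. The sketch below is one way to do that; top_scores and save_cloud are hypothetical helper names, the 'de' entry is made-up sample data, and the wordcloud import sits inside save_cloud so the filtering part runs without the package installed.

```python
def top_scores(scores, n=10):
    """Keep the n highest strictly positive scores (hypothetical helper)."""
    positive = {word: score for word, score in scores.items() if score > 0}
    best = sorted(positive.items(), key=lambda item: item[1], reverse=True)[:n]
    return dict(best)


def save_cloud(frequencies, path):
    """Render a word -> weight mapping to an image file (requires wordcloud)."""
    from wordcloud import WordCloud

    wc = WordCloud(background_color="white", width=1000, height=1000)
    wc.generate_from_frequencies(frequencies)
    wc.to_file(path)


# Two values from the question plus one made-up zero score for illustration.
sample = {
    u'asielzoekers': 0.0034861170591325486,
    u'islam': 0.0029519544163739155,
    u'de': 0.0,
}
selected = top_scores(sample, n=2)
```

Calling save_cloud(selected, "speaker.png") would then write the image to disk instead of showing it with matplotlib; the file name here is only an example.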