Make Frequency histogram from list with tuple elements

jjn picture jjn · Jul 13, 2017 · Viewed 8k times · Source

I would like to make a word frequency distribution, with the words on the x-axis and the frequency count on the y-axis.

I have the following list:

example_list = [('dhr', 17838), ('mw', 13675), ('wel', 5499), ('goed', 5080), 
                ('contact', 4506), ('medicatie', 3797), ('uur', 3792),
                ('gaan', 3473), ('kwam', 3463), ('kamer', 3447), 
                ('mee', 3278), ('gesprek', 2978)] 

I tried to first convert it into a pandas DataFrame and then use the pd.hist() as in the example below, but I just can't figure it out and think it is actually straight forward but probably I'm missing something.

import numpy as np
import matplotlib.pyplot as plt

word = []
frequency = []

for i in range(len(example_list)):
  word.append(example_list[i][0])
  frequency.append(example_list[i][1])


plt.bar(word, frequency, color='r')
plt.show()

Answer

MSeifert picture MSeifert · Jul 13, 2017

You can't pass the words into the matplotlib.pyplot.bar directly. However you could create an indices array for bar and then replace these indices with the words using matplotlib.pyplot.xticks:

import numpy as np
import matplotlib.pyplot as plt

indices = np.arange(len(example_list))
plt.bar(indices, frequency, color='r')
plt.xticks(indices, word, rotation='vertical')
plt.tight_layout()
plt.show()

enter image description here

The for-loop to create word and frequency could also be replaced by a simple zip und list unpacking:

word, frequency = zip(*example_list)