Finding the most popular words in a list

Maciej Ziarko · Mar 9, 2011 · Viewed 16.7k times

I have a list of words:

words = ['all', 'awesome', 'all', 'yeah', 'bye', 'all', 'yeah']

And I want to get a list of tuples:

[(3, 'all'), (2, 'yeah'), (1, 'bye'), (1, 'awesome')]

where each tuple is...

(number_of_occurrences, word)

The list should be sorted by the number of occurrences.

What I've done so far:

def popularWords(words):
    dic = {}
    for word in words:
        dic.setdefault(word, 0)
        dic[word] += 1
    wordsList = [(dic.get(w), w) for w in dic]
    wordsList.sort(reverse = True)
    return wordsList

The question is...

Is it Pythonic, elegant and efficient? Are you able to do it better? Thanks in advance.

Answer

SiggyF · Mar 9, 2011

You can use collections.Counter for this.

import collections
words = ['all', 'awesome', 'all', 'yeah', 'bye', 'all', 'yeah']
counter = collections.Counter(words)
print(counter.most_common())
# prints e.g. [('all', 3), ('yeah', 2), ('bye', 1), ('awesome', 1)]
# (the relative order of words with equal counts can differ between Python versions)

Note that each tuple comes out as (word, number_of_occurrences), which is the reverse of the order you asked for.
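
If you need the (number_of_occurrences, word) order from the question, one option is to swap each pair and sort the result yourself. This is only a sketch; the variable name popular is made up for illustration:

```python
import collections

words = ['all', 'awesome', 'all', 'yeah', 'bye', 'all', 'yeah']
counter = collections.Counter(words)

# Swap each (word, count) pair to (count, word), then sort descending.
# Sorting the swapped tuples breaks ties by word, also in descending order,
# which is the same behaviour as wordsList.sort(reverse=True) in the question.
popular = sorted(((count, word) for word, count in counter.items()), reverse=True)
print(popular)
# [(3, 'all'), (2, 'yeah'), (1, 'bye'), (1, 'awesome')]
```

Sorting explicitly also makes the order of tied words deterministic, instead of relying on how most_common() happens to order them.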

From the comments: collections.Counter is only available in Python 2.7 and 3.1 or later. For earlier versions you can use the Counter recipe instead.
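
If you would rather not pull in the recipe, a rough stand-in is collections.defaultdict, which has been in the standard library since Python 2.5. This is not the recipe itself, just a minimal sketch of the same counting idea; the function name popular_words is hypothetical:

```python
from collections import defaultdict

def popular_words(words):
    """Return (count, word) tuples sorted by count, highest first."""
    counts = defaultdict(int)  # missing words start at 0
    for word in words:
        counts[word] += 1
    # Build (count, word) tuples and sort them in descending order.
    return sorted(((n, w) for w, n in counts.items()), reverse=True)

words = ['all', 'awesome', 'all', 'yeah', 'bye', 'all', 'yeah']
print(popular_words(words))
# [(3, 'all'), (2, 'yeah'), (1, 'bye'), (1, 'awesome')]
```

Compared with the setdefault version in the question, defaultdict(int) removes the need to initialise each key before incrementing it.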