How to add extra stop words in addition to default stopwords in wordcloud?

Command · Jan 1, 2019

I would like to add certain words to the default stopwords list used in wordcloud. Current code:

import matplotlib.pyplot as plt
from wordcloud import WordCloud

all_text = " ".join(rev for rev in twitter_clean.text)
stop_words = ["https", "co", "RT"]
wordcloud = WordCloud(stopwords=stop_words, background_color="white").generate(all_text)
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()

When I pass my custom stop_words variable, words such as "is", "was", and "the" are no longer filtered out and show up as high-frequency words. When I rely on the default stopwords list instead (no stopwords argument), many other unwanted words show up as highly frequent. How do I use my custom stop_words together with the default stopwords list in my wordcloud?

Answer

HakunaMaData · Jan 1, 2019

Just combine your list with the built-in STOPWORDS set. Passing your own stopwords replaces the defaults rather than extending them.

From the wordcloud documentation:

stopwords : set of strings or None. The words that will be eliminated. If None, the build-in STOPWORDS list will be used.

So you can simply add STOPWORDS to your custom list and pass the result as the stopwords argument:

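import matplotlib.pyplot as plt
from wordcloud import WordCloud, STOPWORDS

# twitter_clean comes from the question (assumed to be a DataFrame with a "text" column)
all_text = " ".join(rev for rev in twitter_clean.text)
# Custom words plus the library's default stop words
stop_words = ["https", "co", "RT"] + list(STOPWORDS)
wordcloud = WordCloud(stopwords=stop_words, background_color="white").generate(all_text)
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()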
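
Since the documentation describes stopwords as a set of strings, a set union is another way to write the same thing. Below is a minimal sketch of that variant. The helper name merge_stopwords is hypothetical (not part of wordcloud), the small fallback set is only a stand-in so the snippet runs without the library installed, and lowercasing the extra words is my own precaution rather than something the answer above requires.

```python
try:
    # The library's built-in set of English stop words
    from wordcloud import STOPWORDS
except ImportError:
    # Stand-in so the sketch runs without wordcloud installed; NOT the real list
    STOPWORDS = {"is", "was", "the", "a", "and"}


def merge_stopwords(extra_words, base=STOPWORDS):
    """Return a new set with the base stop words plus the lowercased extras.

    Hypothetical helper for illustration; the base set is left unmodified.
    """
    return set(base) | {word.lower() for word in extra_words}


# Same custom words as in the question
stop_words = merge_stopwords(["https", "co", "RT"])
```

The resulting set can be passed as WordCloud(stopwords=stop_words, ...) in the same way as the list above. Building a new set instead of calling STOPWORDS.update(...) keeps the library's default set untouched for any other word clouds created in the same session.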