I would like to add certain words to the default stopwords list used in wordcloud. Current code:
all_text = " ".join(rev for rev in twitter_clean.text)
stop_words = ["https", "co", "RT"]
wordcloud = WordCloud(stopwords = stop_words, background_color="white").generate(all_text)
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()
When I use the custom stop_words variable, words such as "is", "was", and "the" are no longer filtered out and are displayed as high-frequency words. However, when I use the default stopwords list (no stopwords argument), many other words are displayed as highly frequent. How do I use my custom stop_words variable together with the default stopwords list in my wordcloud?
Just append your list to the built-in STOPWORDS list.
From the wordcloud documentation:
stopwords : set of strings or None. The words that will be eliminated. If None, the build-in STOPWORDS list will be used.
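Passing your own list replaces the defaults instead of extending them, which is why words like "is" and "the" reappear. So you can simply append STOPWORDS to your custom list and pass the combined list as the stopwords argument: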
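from wordcloud import WordCloud, STOPWORDS
import matplotlib.pyplot as plt

# twitter_clean is the DataFrame from the question
all_text = " ".join(rev for rev in twitter_clean.text)
# Combine the custom words with the built-in defaults
stop_words = ["https", "co", "RT"] + list(STOPWORDS)
wordcloud = WordCloud(stopwords=stop_words, background_color="white").generate(all_text)
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()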
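Since the documentation describes stopwords as a set of strings, a set union may be a slightly cleaner way to do the same merge. Below is a minimal sketch of that idea. The helper name merge_stopwords is hypothetical, and sample_defaults is only a small stand-in for wordcloud's STOPWORDS so that the snippet runs without the package installed.

```python
def merge_stopwords(default_words, extra_words):
    """Return a new set holding both the default and the extra stop words."""
    # Building new sets leaves the inputs (e.g. the library's STOPWORDS) unmodified.
    return set(default_words) | set(extra_words)


# Stand-in for wordcloud.STOPWORDS (an assumption made for this sketch);
# in real code, pass the STOPWORDS set imported from wordcloud instead.
sample_defaults = {"is", "was", "the"}
custom_words = ["https", "co", "RT"]

stop_words = merge_stopwords(sample_defaults, custom_words)
print(sorted(stop_words))
```

With the package installed, the result of merge_stopwords(STOPWORDS, custom_words) can be passed directly as WordCloud(stopwords=...). Unlike calling STOPWORDS.update(...), this does not change the shared default set for the rest of the session.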