I'm wondering where I can find the full list of supported languages (and their keys) for the NLTK stopwords.
I found a list at https://pypi.org/project/stop-words/ but it does not contain the key for each language. So it is not clear whether you can retrieve a list by simply calling stopwords.words("Bulgarian"). In fact, that throws an error.
I checked the NLTK site, and there are 4 documents matching "stopwords", but none of them describes this: https://www.nltk.org/search.html?q=stopwords&check_keywords=yes&area=default
Nothing is said about it in their book either: http://www.nltk.org/book/ch02.html#stopwords_index_term
So, do you know where I can find the list of keys?
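The keys are the file names in the downloaded stopwords corpus directory (the path below is from my machine; yours depends on where nltk_data is installed):
import os
os.listdir('/root/nltk_data/corpora/stopwords/')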
['hungarian',
'swedish',
'kazakh',
'norwegian',
'finnish',
'arabic',
'indonesian',
'portuguese',
'turkish',
'azerbaijani',
'slovene',
'spanish',
'danish',
'nepali',
'romanian',
'greek',
'dutch',
'README',
'tajik',
'german',
'english',
'russian',
'french',
'italian']
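The same keys can also be read without hard-coding a path. Below is a minimal sketch, assuming the stopwords corpus has already been fetched with nltk.download("stopwords"). stopwords.fileids() is the NLTK corpus reader method that returns the file names, which are the keys accepted by stopwords.words(). The helper names nltk_stopword_languages and list_stopword_languages are hypothetical, introduced here only for illustration; the second one just filters out the README from a directory listing like the one above.

```python
import os


def nltk_stopword_languages():
    """Return the language keys known to the installed NLTK stopwords corpus.

    Assumes nltk is installed and nltk.download("stopwords") has been run;
    otherwise NLTK raises a LookupError when the corpus is first accessed.
    """
    from nltk.corpus import stopwords  # imported lazily so the helper below works without nltk
    return sorted(stopwords.fileids())


def list_stopword_languages(corpus_dir):
    """Return the language keys found in a stopwords corpus directory.

    Each regular file except README is one language; its name is the key.
    """
    return sorted(
        name
        for name in os.listdir(corpus_dir)
        if name != "README" and os.path.isfile(os.path.join(corpus_dir, name))
    )
```

The keys are lowercase, so stopwords.words("Bulgarian") would not match a file even for a language that is present. Bulgarian also does not appear in the listing above, so no casing of it would be found in that installation.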