Top "Stop-words" questions

Stop words are words that are filtered out before (or after) the processing of natural-language data.
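
As a rough illustration of the definition above, here is a minimal sketch of stop-word filtering in plain Python. The `STOP_WORDS` set and the `remove_stop_words` helper are hypothetical names made up for this example, and the stop list is deliberately tiny; libraries such as NLTK, scikit-learn, and Lucene ship their own, much longer lists.

```python
import re

# Hypothetical, deliberately tiny stop list for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "is", "in"}


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    # Assumption: a simple regex tokenizer is good enough for this sketch.
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token not in stop_words]


print(remove_stop_words("The quick brown fox is in the garden"))
# → ['quick', 'brown', 'fox', 'garden']
```

Using a set for the stop list keeps each membership check constant-time, which matters once the list grows to a few hundred words.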

Getting rid of stop words and document tokenization using NLTK

I’m having difficulty removing stop words from and tokenizing a .text file using NLTK. I keep getting the following AttributeError: 'list' object …

python nltk tokenize stop-words

adding words to stop_words list in TfidfVectorizer in sklearn

I want to add a few more words to stop_words in TfidfVectorizer. I followed the solution in Adding words …

python scikit-learn classification stop-words text-classification

Removing stopwords from a String in Java

I have a string with lots of words and I have a text file which contains some Stopwords which I …

java string stop-words

What is the default list of stopwords used in Lucene's StopFilter?

Lucene has a default stopfilter (http://lucene.apache.org/core/4_0_0/analyzers-common/org/apache/lucene/analysis/core/StopFilter.html), does anyone …

java apache lucene information-retrieval stop-words

SQL 2008: Turn off Stop Words for Full Text Search Query

I'm having quite a bit of difficulty finding a good solution for this: Let's say I have a table of "…

sql-server-2008 full-text-search stop-words

Java Arraylist remove multiple element by index

Here is my code: for (int i = 0; i < myarraylist.size(); i++) { for (int j = 0; j < stopwords.size(); j++) { …

java android arraylist stop-words

"Stop words" list for English?

I'm generating some statistics for some English-language text and I would like to skip uninteresting words such as "a" and "…

language-agnostic indexing filtering stop-words nlp

Adding words to scikit-learn's CountVectorizer's stop list

Scikit-learn's CountVectorizer class lets you pass a string 'english' to the argument stop_words. I want to add some things …

python scikit-learn stop-words

Extract Relevant Tag/Keywords from Text block

I wanted a particular implementation, such that the user provides a block of text like: "Requirements - Working knowledge, on …

php javascript tags stop-words

How to remove list of words from a list of strings

Sorry if the question is a bit confusing. This is similar to this question. I think the above question is …

python regex list-comprehension stop-words