Hi, I am looking for a library that will remove stop words from text in JavaScript. My end goal is to calculate tf-idf and then convert the given document into a vector-space representation, all in JavaScript.
Can anyone point me to a library that will help me do that? Just a library to remove the stop words would also be great.
Use the English stop word list from NLTK (a Python library), copied here as a JavaScript array:
const stopwords = ['i','me','my','myself','we','our','ours','ourselves','you','your','yours','yourself','yourselves','he','him','his','himself','she','her','hers','herself','it','its','itself','they','them','their','theirs','themselves','what','which','who','whom','this','that','these','those','am','is','are','was','were','be','been','being','have','has','had','having','do','does','did','doing','a','an','the','and','but','if','or','because','as','until','while','of','at','by','for','with','about','against','between','into','through','during','before','after','above','below','to','from','up','down','in','out','on','off','over','under','again','further','then','once','here','there','when','where','why','how','all','any','both','each','few','more','most','other','some','such','no','nor','not','only','own','same','so','than','too','very','s','t','can','will','just','don','should','now']
Then simply pass your string into the following function:
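function remove_stopwords(str) {
    const res = []
    const words = str.split(' ')
    for (let i = 0; i < words.length; i++) {
        // Strip periods so that "me." matches "me"
        const word_clean = words[i].split(".").join("")
        // Case-sensitive lookup against the stopwords array
        if (!stopwords.includes(word_clean)) {
            res.push(word_clean)
        }
    }
    return res.join(' ')
}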
Example:
remove_stopwords("I will go to the place where there are things for me.")
Result:
I go place things
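Note that "I" survives in the result: the lookup is case-sensitive and the list is all lowercase, and only periods are stripped. If that is not what you want, here is a rough sketch of a variant that lowercases the input and splits on anything that is not an ASCII letter or digit, so "don't" becomes "don" and "t", which matches how the NLTK list is built. The name removeStopwordsNormalized and the short sampleStopwords set are made up for illustration; swap in the full array, and note that the ASCII-only split is an assumption that will not suit accented text.

```javascript
// Sample list only (hypothetical); substitute the full NLTK array.
const sampleStopwords = new Set([
  'i', 'me', 'will', 'to', 'the', 'where', 'there', 'are', 'for', 'don', 't',
]);

// Hypothetical helper: case-insensitive, punctuation-aware stop word removal.
function removeStopwordsNormalized(str, stopSet = sampleStopwords) {
  return str
    .toLowerCase()
    .split(/[^a-z0-9]+/) // split on anything that is not an ASCII letter or digit
    .filter((w) => w.length > 0 && !stopSet.has(w)) // drop empty pieces and stop words
    .join(' ');
}

console.log(
  removeStopwordsNormalized('I will go to the place where there are things for me.')
);
// prints "go place things"
```

A Set is used here because has() is a constant-time lookup, while includes() on an array scans the whole list for every word.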
If a word you want removed isn't covered already, just add it to the stopwords array.
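For the tf-idf step mentioned in the question, a minimal sketch follows. It assumes each document has already been turned into an array of tokens with stop words removed, and it uses one common variant: tf = term count / document length, idf = ln(N / df), with no smoothing. Libraries often use other variants, so treat this as an illustration and not a reference implementation; tfidfVectors is a made-up name.

```javascript
// docs: array of token arrays (assumed already lowercased, stop words removed).
// Returns the vocabulary and one tf-idf vector per document, aligned to it.
function tfidfVectors(docs) {
  // Vocabulary: every distinct term, in first-seen order.
  const vocab = [...new Set(docs.flat())];

  // Document frequency: number of documents that contain each term.
  const df = new Map();
  for (const doc of docs) {
    for (const term of new Set(doc)) {
      df.set(term, (df.get(term) || 0) + 1);
    }
  }

  const vectors = docs.map((doc) => {
    // Raw term counts for this document.
    const counts = new Map();
    for (const term of doc) {
      counts.set(term, (counts.get(term) || 0) + 1);
    }
    return vocab.map((term) => {
      const tf = doc.length ? (counts.get(term) || 0) / doc.length : 0;
      const idf = Math.log(docs.length / df.get(term)); // unsmoothed idf
      return tf * idf;
    });
  });

  return { vocab, vectors };
}

const { vocab, vectors } = tfidfVectors([
  ['go', 'place', 'things'],
  ['go', 'home'],
]);
console.log(vocab);   // [ 'go', 'place', 'things', 'home' ]
console.log(vectors); // 'go' scores 0 in both documents because it appears in all of them
```

Each vector has one entry per vocabulary term, so the documents can be compared directly, for example with cosine similarity.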