Top "Tokenize" questions

Tokenizing is the act of splitting a string into discrete elements called tokens.
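For example, the simplest form of tokenization is splitting on whitespace (a trivial Python illustration with a made-up sample string):

```python
text = "Tokenizing is the act of splitting a string into tokens"
tokens = text.split()  # split on runs of whitespace
print(tokens)  # ['Tokenizing', 'is', 'the', 'act', ...]
```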

Remove stopwords and tokenize for BigramCollocationFinder (NLTK)

I keep getting the error "TypeError: expected string or buffer", raised from return _compile(pattern, flags).sub(repl, string, count), when …

python nltk tokenize stop-words
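A minimal sketch of the usual NLTK pattern for this, assuming the punkt and stopwords data have been downloaded; the sample text is a placeholder. Note that word_tokenize expects a single string, and passing it a list instead is a common cause of the "expected string or buffer" TypeError quoted above.

```python
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from nltk.collocations import BigramCollocationFinder, BigramAssocMeasures

# One-time data downloads, if needed:
# import nltk; nltk.download("punkt"); nltk.download("stopwords")

text = "the quick brown fox jumps over the lazy dog and the quick brown cat"

# word_tokenize takes a single string, not a list of strings.
tokens = word_tokenize(text.lower())

stop_words = set(stopwords.words("english"))
filtered = [t for t in tokens if t.isalpha() and t not in stop_words]

# Find the top bigram collocations among the remaining tokens.
bigram_measures = BigramAssocMeasures()
finder = BigramCollocationFinder.from_words(filtered)
print(finder.nbest(bigram_measures.pmi, 5))
```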
Sentence Segmentation using Spacy

I am new to spaCy and NLP. I am facing the issue below while doing sentence segmentation with spaCy. The text I …

nlp tokenize spacy sentence
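A minimal sketch of sentence segmentation with spaCy, assuming the en_core_web_sm model is installed (python -m spacy download en_core_web_sm); the sample text is a placeholder.

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # the parser supplies sentence boundaries
doc = nlp("This is the first sentence. Here is the second one! Is this the third?")

for sent in doc.sents:
    print(sent.text)
```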
Search for names (text) with spaces in Elasticsearch

Searching for names (text) with spaces in them is causing me problems. I have a mapping similar to "{"user":{"properties":{"name":{"…

search elasticsearch tokenize analyzer
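A common fix is to index the name untokenized so that values with spaces match exactly. A sketch, assuming Elasticsearch 7+ on a local cluster; the index name, field layout, and URL are placeholders, since the actual mapping in the question is truncated.

```python
import requests

index_url = "http://localhost:9200/users"  # hypothetical index

# A "keyword" sub-field stores the name as a single untokenized term,
# so "John Smith" can be matched exactly instead of being split on spaces.
mapping = {
    "mappings": {
        "properties": {
            "name": {
                "type": "text",
                "fields": {"raw": {"type": "keyword"}},
            }
        }
    }
}
print(requests.put(index_url, json=mapping).json())

# Exact match against the untokenized sub-field.
query = {"query": {"term": {"name.raw": "John Smith"}}}
print(requests.post(f"{index_url}/_search", json=query).json())
```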
What are all the Japanese whitespace characters?

I need to split a string and extract words separated by whitespace characters. The source may be in English or …

text unicode whitespace tokenize cjk
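The Japanese-specific whitespace character encountered most often is the ideographic (full-width) space, U+3000. A quick illustration in Python 3, where \s already matches it because str patterns use Unicode matching by default; the sample text is made up.

```python
import re

# "\u3000" is the ideographic (full-width) space used in Japanese text.
text = "Hello world\u3000これは\u3000テスト です"

# \s covers U+3000 as well as ASCII whitespace under Unicode matching.
print(re.split(r"\s+", text))
# ['Hello', 'world', 'これは', 'テスト', 'です']
```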
Does PL/SQL have an equivalent to Java's StringTokenizer?

I use java.util.StringTokenizer for simple parsing of delimited strings in Java. I need the same …

sql oracle plsql tokenize stringtokenizer
Tokenizing using Pandas and spaCy

I'm working on my first Python project and have a reasonably large dataset (tens of thousands of rows). I need to …

python python-3.x pandas tokenize spacy
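A minimal sketch of one common pattern, assuming a DataFrame with a hypothetical "text" column and the en_core_web_sm model installed; nlp.pipe streams the column instead of calling nlp() row by row, which matters at tens of thousands of rows.

```python
import pandas as pd
import spacy

# Disable pipeline components that plain tokenization doesn't need.
nlp = spacy.load("en_core_web_sm", disable=["parser", "ner"])

df = pd.DataFrame({"text": ["First example sentence.", "Another row of raw text here."]})

# nlp.pipe processes the texts as a stream and yields one Doc per row.
df["tokens"] = [[token.text for token in doc] for doc in nlp.pipe(df["text"])]
print(df)
```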
Tokenizing a String with tab delimiter in Java while skipping some tokens

I have a huge data file (~8 GB, ~80 million records). Every record has 6 to 8 attributes which are split by a …

java tokenize stringtokenizer
Using multiple tokenizers in Solr

I want to perform a query and get back results that are not case …

solr tokenize
Elasticsearch "pattern_replace", replacing whitespaces while analyzing

Basically, I want to remove all whitespace and tokenize the whole string as a single token. (I will use nGram …

elasticsearch whitespace tokenize removing-whitespace
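One way this is commonly configured: a pattern_replace character filter strips the whitespace before a keyword tokenizer emits the whole value as a single token (an nGram token filter could then be appended). A sketch, assuming Elasticsearch 7+ on a local cluster; the index and analyzer names are placeholders.

```python
import requests

index_url = "http://localhost:9200/my_index"  # hypothetical index

settings = {
    "settings": {
        "analysis": {
            "char_filter": {
                "strip_whitespace": {
                    "type": "pattern_replace",
                    "pattern": "\\s+",
                    "replacement": "",
                }
            },
            "analyzer": {
                "single_token_no_spaces": {
                    "type": "custom",
                    "char_filter": ["strip_whitespace"],
                    "tokenizer": "keyword",
                }
            },
        }
    }
}
print(requests.put(index_url, json=settings).json())

# Inspect what the analyzer produces for a sample value.
test = {"analyzer": "single_token_no_spaces", "text": "foo bar baz"}
print(requests.post(f"{index_url}/_analyze", json=test).json())
```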
Boost::Split using whole string as delimiter

I would like to know whether there is a way, using boost::split, to split a string using whole strings …

c++ string boost tokenize