Top "Tokenize" questions

Tokenizing is the act of splitting a string into discrete elements called tokens.
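
As a rough illustration of that definition, here is a minimal sketch in Python using only the standard `re` module. The `tokenize` helper and its rule (runs of word characters are one token, each punctuation mark is its own token) are assumptions chosen for this example, not a method taken from any of the questions listed below.

```python
import re

def tokenize(text):
    """Split text into word tokens and single-character punctuation tokens."""
    # \w+      matches a run of letters, digits, or underscores (one word token)
    # [^\w\s]  matches one character that is neither a word character nor whitespace
    return re.findall(r"\w+|[^\w\s]", text)

tokens = tokenize("Hello, world! It works.")
print(tokens)  # ['Hello', ',', 'world', '!', 'It', 'works', '.']
```

Whitespace is discarded here because it only separates tokens; a real tokenizer would pick its rules to suit the language or grammar being processed.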

NLTK tokenize - faster way?

I have a method that takes in a String parameter, and uses NLTK to break the String down to sentences, …

python time-complexity nltk tokenize frequency
Difference between StandardTokenizerFactory and KeywordTokenizerFactory in Solr?

I am new to Solr. I want to know when to use StandardTokenizerFactory and KeywordTokenizerFactory. I read the docs on …

java solr solrnet tokenize
What are some practical uses of PHP tokenizer?

What are practical, day-to-day usage examples of the PHP Tokenizer? Has anyone used it?

php tokenize
What is more efficient: a switch case or an std::map?

I'm thinking about the tokenizer here. Each token calls a different function inside the parser. What is more efficient: A …

c++ parsing tokenize
Get the last token of a string in C

What I want to do is, given an input string, which I will not know its size or the number …

c tokenize strtok
Add multiValued field to a SolrInputDocument

We are using an embedded Solr instance for Java SolrJ. I want to add a multivalued field to a document. …

java tokenize solrj
How do I tokenize this string in Ruby?

I have this string: %{Children^10 Health "sanitation management"^5} And I want to tokenize this into an array …

ruby parsing tokenize text-parsing
C - Determining which delimiter used - strtok()

Let's say I'm using strtok() like this: char *token = strtok(input, ";-/"); Is there a way to figure out which …

c tokenize strtok
How to parse / tokenize an SQL statement in Node.js

I'm looking for a way to parse / tokenize an SQL statement within a Node.js application, in order to: Tokenize all …

sql node.js parsing tokenize sql-parser
Python re.split() vs nltk word_tokenize and sent_tokenize

I was going through this question. I am just wondering whether NLTK would be faster than regex in word/sentence tokenization.

python regex nlp nltk tokenize