Top "N-gram" questions

An N-gram is an ordered sequence of N consecutive elements of the same kind (for example characters or words), usually drawn from a larger text or corpus that contains many other N-grams.
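
As a rough illustration of the definition, here is a minimal Python sketch that extracts N-grams from a sequence and counts them. The helper name `ngrams` and the sample sentence are assumptions made for this example; they do not come from any of the questions listed below.

```python
from collections import Counter


def ngrams(items, n):
    """Return every run of n consecutive elements of `items` as a tuple."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Slide a window of width n over the sequence, one position at a time.
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


# Word-level bigrams from a hypothetical sample sentence.
words = "to be or not to be".split()
bigrams = ngrams(words, 2)

# Counting N-grams over a corpus is the usual next step.
bigram_counts = Counter(bigrams)

# Character-level trigrams work the same way, since a string is a sequence.
char_trigrams = ["".join(g) for g in ngrams("hello", 3)]
```

The same function covers both word-level and character-level N-grams because it only relies on slicing; for very large corpora a generator would likely be preferable to building the full list.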

Java Lucene NGramTokenizer

I am trying to tokenize strings into ngrams. Strangely, in the documentation for the NGramTokenizer I do not see a method …

java lucene tokenize n-gram

How to compute skipgrams in python?

A k-skipgram is an ngram which is a superset of all ngrams and each (k-i)-skipgram till (k-i)==0 (which …

python nlp n-gram language-model

R and tm package: create a term-document matrix with a dictionary of one or two words?

Purpose: I want to create a term-document matrix using a dictionary which has compound words, or bigrams, as some of …

r tm n-gram term-document-matrix rweka

N-grams: Explanation + 2 applications

I want to implement some applications with n-grams (preferably in PHP). Which type of n-grams is more adequate for most …

php nlp analysis n-gram

Is there a more efficient way to find most common n-grams?

I'm trying to find k most common n-grams from a large corpus. I've seen lots of places suggesting the naï…

algorithm nlp n-gram

How to get n-gram collocations and association in python nltk?

In this documentation, there is an example using nltk.collocations.BigramAssocMeasures(), BigramCollocationFinder, nltk.collocations.TrigramAssocMeasures(), and TrigramCollocationFinder. There is an example method …

python nlp nltk n-gram collocation

n-gram sentence similarity with cosine similarity measurement

I have been working on a project about sentence similarity. I know it has been asked many times on SO, …

similarity trigonometry n-gram

The n-gram that is the most frequent one among all the words

I came across the following programming interview problem: "Challenge 1: N-grams. An N-gram is a sequence of N consecutive characters from …

c algorithm n-gram

Auto completion search with Solr using NGrams

I'm working on auto completion search with Solr using EdgeNGrams. If the user is searching for names of employees, then …

autocomplete solr n-gram

ElasticSearch n-gram tokenfilter not finding partial words

I have been playing around with ElasticSearch for a new project of mine. I have set the default analyzers to …

n-gram elasticsearch