Top "N-gram" questions

An N-gram is an ordered sequence of N elements of the same kind (for example, characters or words), usually found within a large collection of many other similar N-grams.
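
As a rough illustration of the definition above, here is a minimal sketch in plain Python that slides a window of size N over a list of elements. The helper name `ngrams`, the whitespace tokenization, and the sample sentence are assumptions chosen for this example; they are not taken from any of the questions listed below, and libraries such as NLTK provide their own equivalents.

```python
from collections import Counter


def ngrams(items, n):
    """Yield each run of n consecutive items as a tuple (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    items = list(items)
    # A sequence of length L contains L - n + 1 n-grams (none if L < n).
    for i in range(len(items) - n + 1):
        yield tuple(items[i:i + n])


# Assumption: naive whitespace tokenization, good enough for a demo.
tokens = "to be or not to be".split()

bigrams = list(ngrams(tokens, 2))
trigrams = list(ngrams(tokens, 3))

# Counting how often each bigram occurs in the token list.
counts = Counter(bigrams)
```

Because the helper accepts any iterable, passing a string instead of a token list would produce character n-grams rather than word n-grams.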

n-grams in python, four, five, six grams?

I'm looking for a way to split a text into n-grams. Normally I would do something like: import nltk from …

python string nltk n-gram

Computing N Grams using Python

I needed to compute the unigrams, bigrams and trigrams for a text file containing text like: "Cystic fibrosis affects 30,000 children …

python nlp nltk n-gram

counting n-gram frequency in python nltk

I have the following code. I know that I can use the apply_freq_filter function to filter out collocations that …

python nltk n-gram

N-gram generation from a sentence

How can I generate the n-grams of a string like: String Input="This is my car." I want to generate n-gram …

java lucene nlp n-gram

Python: Reducing memory usage of dictionary

I'm trying to load a couple of files into memory. The files have one of the following 3 formats: string …

python memory dictionary compression n-gram

Python NLTK: Bigrams trigrams fourgrams

I have this example and I want to know how to get this result. I have text and I tokenize …

python nltk n-gram

Generate bigrams with NLTK

I am trying to produce a bigram list of a given sentence; for example, if I type, To be or …

python nltk n-gram

Simple implementation of N-Gram, tf-idf and Cosine similarity in Python

I need to compare documents stored in a DB and come up with a similarity score between 0 and 1. The method …

python document n-gram tf-idf vsm

Elasticsearch: Find substring match

I want to perform both exact word match and partial word/substring match. For example, if I search for "men's …

autocomplete elasticsearch substring stringtokenizer n-gram

Understanding the `ngram_range` argument in a CountVectorizer in sklearn

I'm a little confused about how to use n-grams in the scikit-learn library in Python; specifically, how the ngram_range …

python scikit-learn n-gram feature-selection