Top "Text-mining" questions

Text mining is the process of deriving high-quality information from unstructured text.

What is "entropy and information gain"?

I am reading this book (NLTK) and it is confusing. Entropy is defined as: Entropy is the sum of the …

math text computer-science nltk text-mining
Using Sklearn's TfidfVectorizer transform

I am trying to get the tf-idf vector for a single document using Sklearn's TfidfVectorizer object. I create a vocabulary …

python document text-mining tf-idf
List of word frequencies using R

I have been using the tm package to run some text analysis. My problem is with creating a list with …

r text-mining word-frequency term-document-matrix
R tm package invalid input in 'utf8towcs'

I'm trying to use the tm package in R to perform some text analysis. I tried the following: require(tm) …

r utf-8 iconv text-mining
R-Project no applicable method for 'meta' applied to an object of class "character"

I am trying to run this code (Ubuntu 12.04, R 3.1.1) # Load requisite packages library(tm) library(ggplot2) library(lsa) # Place Enron …

r text-mining tm
AttributeError: 'GridSearchCV' object has no attribute 'cv_results_'

I am trying to apply this code: pipe = make_pipeline(TfidfVectorizer(min_df=5), LogisticRegression()) param_grid = {'logisticregression__C': [0.001, 0.01, 0.1, 1, 10, 100], "tfidfvectorizer__ngram_range": [(1, 1), (1, 2), (1, 3)]} …

python machine-learning scikit-learn text-mining
R text file and text mining...how to load data

I am using the R package tm and I want to do some text mining. This is one document and …

r load text-mining tm
Text-mining with the tm-package - word stemming

I am doing some text mining in R with the tm package. Everything works very smoothly. However, one problem occurs after …

r text-mining tm
Adding custom stopwords in R tm

I have a Corpus in R using the tm package. I am applying the removeWords function to remove stopwords tm_…

r text-mining stop-words corpus tm