Top "Data-mining" questions

Data mining is the process of analyzing large amounts of data in order to find patterns and commonalities.

DBSCAN error with cosine metric in python

I was trying to use the DBSCAN algorithm from the scikit-learn library with the cosine metric but got stuck on an error. The …

scikit-learn cluster-analysis data-mining cosine-similarity dbscan
TFIDF calculating confusion

I found the following code on the internet for calculating TFIDF: https://github.com/timtrueman/tf-idf/blob/master/tf-idf.py …

python data-mining text-processing information-retrieval tf-idf
How to deal with missing attribute values in C4.5 (J48) decision tree?

What's the best way to handle missing feature attribute values with Weka's C4.5 (J48) decision tree? The problem of missing …

machine-learning data-mining weka decision-tree classification
Cosine distance as vector distance function for k-means

I have a graph of N vertices where each vertex represents a place. Also I have vectors, one per user, …

cluster-analysis data-mining distance k-means cosine-similarity
How to find out if a sentence is a question (interrogative)?

Is there an open source Java library/algorithm for finding if a particular piece of text is a question or …

java algorithm nlp data-mining text-processing
Effects of Stemming on the term frequency?

How are term frequency (TF) and inverse document frequency (IDF) affected by stop-word removal and stemming? Thanks!

data-mining text-processing tf-idf stop-words stemming
What are data requirements for FP-Growth in Weka?

I'd like to use the FP-Growth association rule algorithm on my dataset (model) in Weka. Unfortunately, this algorithm is greyed out. …

java data-mining weka
DBSCAN for clustering data by location and density

I'm using the method dbscan::dbscan in order to cluster my data by location and density. My data looks like …

r machine-learning cluster-analysis data-mining dbscan
Download link for Ta Feng Grocery dataset

I have been desperately trying to download the Ta-Feng grocery dataset for a few days, but it appears that all the links are broken. …

machine-learning dataset data-mining
Test and Training Set are Not Compatible

I have seen various articles about the same issue and tried a lot of solutions, but nothing is working. Kindly advise. …

csv data-mining weka test-data training-data