Top "Data-mining" questions

Data mining is the process of analyzing large amounts of data in order to find patterns and commonalities.

How to test if a kernel is a valid kernel

If I define my own method of determining the similarity between two input entities of my Support Vector Machine classifier, …

machine-learning data-mining svm
scikit-learn DBSCAN memory usage

UPDATED: In the end, the solution I opted to use for clustering my large dataset was one suggested by Anony-Mousse …

python scikit-learn cluster-analysis data-mining dbscan
How would one use Kernel Density Estimation as a 1D clustering method in scikit-learn?

I need to cluster a simple univariate data set into a preset number of clusters. Technically it would be closer …

machine-learning scikit-learn cluster-analysis data-mining kernel-density
Removing outliers from a k-means cluster

I have a number of smaller data sets, containing 10 XY coordinates each. I am using Matlab (R2012a) and k-means to …

matlab data-mining cluster-analysis k-means outliers
How exactly does sharkscope or PTR data mine all those hands?

I'm very curious to know how this process works. These sites (http://www.sharkscope.com and http://www.pokertableratings.com) …

data-mining poker
Outlier detection in data mining

I have a few sets of questions regarding outlier detection: Can we find outliers using k-means and is this a …

data-mining svm outliers
Computing F-measure for clustering

Can anyone help me to calculate F-measure collectively? I know how to calculate recall and precision, but don't know for …

cluster-analysis data-mining precision-recall
Supermarket dataset for Apriori algorithm

'I have to develop a software which is meant for Business Analyst of “Future Stores” Supermarket, the software performs the …

dataset integration-testing data-mining apriori
How to find common phrases in a large body of text

I'm working on a project at the moment where I need to pick out the most common phrases in a …

data-structures graph data-mining text-analysis
What is the difference between a Confusion Matrix and Contingency Table?

I'm writing a piece of code to evaluate my Clustering Algorithm and I find that every kind of evaluation method …

matrix cluster-analysis data-mining difference