Data mining is the process of analyzing large amounts of data in order to find patterns and commonalities.
I want to cluster documents based on similarity. I have tried ssdeep (similarity hashing), which is very fast, but I was told …
nlp cluster-analysis data-mining k-means text-mining

I have 1 million 5-dimensional points that I need to group into k clusters with k << 1 million. In each …
algorithm machine-learning cluster-analysis data-mining k-means

People often throw around the terms IR, ML, and data mining, but I have noticed a lot of overlap between …
machine-learning data-mining information-retrieval

What does dimensionality reduction mean exactly? I searched for its meaning, but I only found that it means the transformation of …
machine-learning artificial-intelligence data-mining terminology

Matlab, R, and Python are powerful but either costly or slow for some data mining work I'd like to do. …
javascript data-mining scientific-computing

When I use the multiclass.roc function in R (pROC package), for instance, I trained a data set by random forest, …
r data-mining random-forest roc proc-r-package

I'm trying to use scikit-learn to cluster text documents. On the whole, I find my way around, but I have …
machine-learning scikit-learn cluster-analysis data-mining dbscan

I am using the gbm function in R (gbm package) to fit stochastic gradient boosting models for multiclass classification. I …
r machine-learning classification data-mining gbm

This same installation of Weka has loaded for me in the past. I am simply trying to load the Weka …
machine-learning data-mining weka

I want to create my own simple recommendation system for books. But there are some problems - it's impossible (at least, …
data-mining recommendation-engine