Top "Classification" questions

In machine learning and statistics, classification is the problem of identifying which of a set of categories a new observation belongs to, on the basis of a training set of data containing observations whose category membership (label) is known.

How can I know whether my training data is enough for machine learning?

For example: If I want to train a classifier (maybe an SVM), how many samples do I need to collect? Is …

machine-learning classification sample-data
I want a machine to learn to categorize short texts

I have a ton of short stories about 500 words long and I want to categorize them into one of, let's …

machine-learning nlp classification
Predict probabilities using SVM

I wrote this code and wanted to obtain probabilities of classification. from sklearn import svm X = [[0, 0], [10, 10],[20,30],[30,30],[40, 30], [80,60], [80,50]] y = [0, 1, 2, 3, 4, 5, 6] clf = svm.SVC() …

python classification svm libsvm
Creating an ARFF file from python output

gardai-plan-crackdown-on-troublemakers-at-protest-2438316.html': {'dail': 1, 'focus': 1, 'actions': 1, 'trade': 2, 'protest': 1, 'identify': 1, 'previous': 1, 'detectives': 1, 'republican': 1, 'group': 1, 'monitor': 1, 'clashes': 1, 'civil': 1, 'charge': 1, 'breaches': 1, 'travelling': 1, 'main': 1, 'disrupt': 1, …

python file classification weka arff
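
The excerpt above shows per-document word counts held in a Python dictionary. As a minimal sketch of one way to write such data as a dense ARFF file, assuming the structure is `{document_name: {word: count}}`: the function name `counts_to_arff`, the relation name, and the sample data are hypothetical, and the words are assumed to be plain alphanumeric tokens that need no quoting.

```python
def counts_to_arff(doc_counts, relation="documents"):
    """Render {doc_name: {word: count}} as a dense ARFF string.

    Assumption: every word is a plain alphanumeric token, so attribute
    names do not need quoting.
    """
    # The attribute list is the sorted vocabulary across all documents.
    vocab = sorted({word for counts in doc_counts.values() for word in counts})

    lines = [f"@RELATION {relation}", ""]
    lines += [f"@ATTRIBUTE {word} NUMERIC" for word in vocab]
    lines += ["", "@DATA"]

    # One comma-separated row per document; missing words count as 0.
    for name in sorted(doc_counts):
        counts = doc_counts[name]
        lines.append(",".join(str(counts.get(word, 0)) for word in vocab))
    return "\n".join(lines) + "\n"


# Hypothetical sample data, not taken from the question.
sample = {
    "doc_a.html": {"protest": 1, "trade": 2},
    "doc_b.html": {"trade": 1, "group": 3},
}
arff_text = counts_to_arff(sample)
```

Writing `arff_text` to a file ending in `.arff` should give something Weka can load. For large vocabularies the sparse ARFF row format is probably a better fit than the dense rows used here.
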
How to optimize a sklearn pipeline, using XGBoost, for a different `eval_metric`?

I'm trying to use XGBoost, and optimize the eval_metric as auc (as described here). This works fine when using …

python scikit-learn classification pipeline xgboost
What is the difference between sample weight and class weight options in scikit learn?

I have a class imbalance problem and want to solve it using cost-sensitive learning. Undersampling and oversampling give …

python machine-learning scikit-learn classification
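
To make the distinction in the question above concrete, here is a rough sketch in plain Python. It assumes the common convention, used by many scikit-learn estimators, that a class weight applies to every sample of that class and is multiplied by any per-sample weight. The "balanced" formula is the `n_samples / (n_classes * class_count)` heuristic; the helper names and the toy labels are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def effective_weights(labels, class_weight, sample_weight=None):
    """Combine per-class and per-sample weights by multiplication."""
    if sample_weight is None:
        sample_weight = [1.0] * len(labels)
    return [class_weight[y] * w for y, w in zip(labels, sample_weight)]


# Toy imbalanced labels: four samples of class 0, one of class 1.
labels = [0, 0, 0, 0, 1]
cw = balanced_class_weights(labels)
# The fourth sample gets an extra per-sample weight of 2.
weights = effective_weights(labels, cw, sample_weight=[1, 1, 1, 2, 1])
```

Under these assumptions, a class weight is just a shorthand for giving all samples of one class the same weight, while sample weights let individual observations count more or less.
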
Python: How to get the real feature names from feature_importances_

I am using Python's sklearn random forest (ensemble.RandomForestClassifier) to do classification and am using feature_importances_ to find significant …

python scikit-learn classification feature-selection
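
The values in `feature_importances_` follow the column order of the training matrix, so one possible approach is to pair them with the column names kept alongside the data. The sketch below uses plain Python; `rank_features`, the feature names, and the importance values are illustrative placeholders standing in for a fitted model's output.

```python
def rank_features(feature_names, importances, top_n=None):
    """Pair each importance with its column name, highest first."""
    if len(feature_names) != len(importances):
        raise ValueError("need exactly one name per importance value")
    ranked = sorted(
        zip(feature_names, importances),
        key=lambda pair: pair[1],
        reverse=True,
    )
    # top_n=None keeps the full ranking.
    return ranked[:top_n]


# Placeholder values; in practice these would come from the column names
# of the training data and from the fitted model's feature_importances_.
names = ["age", "income", "height"]
importances = [0.2, 0.7, 0.1]
ranking = rank_features(names, importances, top_n=2)
```

This only works if the names are listed in the same order as the columns that were passed to `fit`.
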
How to convert distance into probability?

Can anyone shed some light on my MATLAB program? I have data from two sensors and I'm doing a kNN …

matlab classification knn euclidean-distance probability-density
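
One common way to turn kNN distances into class probabilities is to let each of the k nearest neighbours vote with a weight of 1/distance and then normalise the votes. The sketch below is in Python rather than MATLAB and is not the asker's method; the function name, the `eps` guard against division by zero, and the toy data are assumptions made for illustration.

```python
import math
from collections import defaultdict


def knn_class_probabilities(train, query, k=3, eps=1e-9):
    """Estimate class probabilities from inverse-distance-weighted votes.

    train: list of (point, label) pairs, where point is a tuple of floats.
    """
    # Euclidean distance from the query to every training point.
    by_distance = sorted(
        ((math.dist(point, query), label) for point, label in train),
        key=lambda pair: pair[0],
    )
    scores = defaultdict(float)
    for distance, label in by_distance[:k]:
        scores[label] += 1.0 / (distance + eps)  # closer neighbours count more
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


# Toy two-sensor readings with hypothetical labels.
train = [((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((5.0, 5.0), "b")]
probs = knn_class_probabilities(train, query=(0.0, 0.5), k=3)
```

The resulting values sum to 1 and favour the closer class, but they are heuristic scores rather than calibrated probabilities.
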
Open Source Naïve Bayes Classifier written in Java

I'm looking for an Open Source Naïve Bayes Classifier library written in Java. Would appreciate any help in finding …

java open-source bayesian classification bayesian-networks
Dealing with the class imbalance in binary classification

Here's a brief description of my problem: I am working on a supervised learning task to train a binary classifier. …

python r machine-learning classification