.arff files with scikit-learn?

tumbleweed · Dec 3, 2014 · Viewed 20.5k times

I would like to use an Attribute-Relation File Format (.arff) file with scikit-learn for an NLP task. Is this possible? How can I use an .arff file with scikit-learn?

Answer

renatopp · Dec 4, 2014

I really recommend liac-arff. It doesn't load directly into a NumPy array, but the conversion is simple:

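import arff  # provided by the liac-arff package
import numpy as np

with open('mydataset.arff', 'r') as f:  # text mode; 'rb' fails on Python 3
    dataset = arff.load(f)
data = np.array(dataset['data'])  # one row per instance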
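To go from the loaded data to something scikit-learn can train on, the rows still need to be split into features and labels. The following is only a rough sketch under some assumptions: the `dataset` dict here is a hand-written stand-in with the same `'attributes'` and `'data'` keys that liac-arff returns, the class label is assumed to be the last attribute (common in ARFF files, but not guaranteed), and `split_features_and_labels` is a hypothetical helper name, not part of any library.

```python
# Hand-written stand-in for the dict returned by arff.load();
# a real file would supply its own attributes and rows.
dataset = {
    'relation': 'toy',
    'attributes': [
        ('length', 'NUMERIC'),
        ('width', 'NUMERIC'),
        ('label', ['pos', 'neg']),  # nominal attribute: list of allowed values
    ],
    'data': [
        [5.1, 3.5, 'pos'],
        [4.9, 3.0, 'neg'],
        [6.2, 2.9, 'pos'],
    ],
}


def split_features_and_labels(dataset):
    """Split rows into features X and labels y.

    Assumes the class label is the LAST attribute of every row.
    """
    names = [name for name, _ in dataset['attributes']]
    X = [row[:-1] for row in dataset['data']]  # every column except the last
    y = [row[-1] for row in dataset['data']]   # last column only
    return names[:-1], X, y


feature_names, X, y = split_features_and_labels(dataset)
```

The resulting `X` and `y` are plain lists, which scikit-learn estimators generally accept as array-like input to `fit(X, y)`; wrapping them in `np.array` first works as well. For an NLP task where an attribute holds raw text, that column would most likely need to go through a vectorizer such as `CountVectorizer` before training.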