I trained and created a J48 model using WEKA gui. I saved the model file to my computer and now I would like to use it to classify a single instance in my Java code. I would like to get a prediction for the attribute "cluster". What I do is the following:
public void classify(double lat, double lon, double co)
{
// Create attributes to be used with classifiers
Attribute latitude = new Attribute("latitude");
Attribute longitude = new Attribute("longitude");
Attribute carbonmonoxide = new Attribute("co");
// Create instances for each pollutant with attribute values latitude, longitude and pollutant itself
inst_co = new DenseInstance(4);
// Set instance's values for the attributes "latitude", "longitude", and "pollutant concentration"
inst_co.setValue(latitude, lat);
inst_co.setValue(longitude, lon);
inst_co.setValue(carbonmonoxide, co);
inst_co.setMissing(cluster);
Classifier cls_co = (Classifier) weka.core.SerializationHelper.read("/CO_J48Model.model");//load classifier from file
// Test the model
double result = cls_co.classifyInstance(inst_co);
}
However, I get an IndexArrayOutofBoundsException on the line inst_co.setValue(latitude, lat);
. I couldn't find the reason for this exception. I will appreciate if someone could point me in the right direction.
You need to add your inst_co to your data set, an Instances object. Following code should work.
import java.util.ArrayList;
import weka.classifiers.Classifier;
import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instance;
import weka.core.Instances;
public class QuestionInstanceClassifiy {
public static void main(String[] args) {
QuestionInstanceClassifiy q = new QuestionInstanceClassifiy();
double result = q.classify(1.0d, 1, 1);
System.out.println(result);
}
private Instance inst_co;
public double classify(double lat, double lon, double co) {
// Create attributes to be used with classifiers
// Test the model
double result = -1;
try {
ArrayList<Attribute> attributeList = new ArrayList<Attribute>(2);
Attribute latitude = new Attribute("latitude");
Attribute longitude = new Attribute("longitude");
Attribute carbonmonoxide = new Attribute("co");
ArrayList<String> classVal = new ArrayList<String>();
classVal.add("ClassA");
classVal.add("ClassB");
attributeList.add(latitude);
attributeList.add(longitude);
attributeList.add(carbonmonoxide);
attributeList.add(new Attribute("@@class@@",classVal));
Instances data = new Instances("TestInstances",attributeList,0);
// Create instances for each pollutant with attribute values latitude,
// longitude and pollutant itself
inst_co = new DenseInstance(data.numAttributes());
data.add(inst_co);
// Set instance's values for the attributes "latitude", "longitude", and
// "pollutant concentration"
inst_co.setValue(latitude, lat);
inst_co.setValue(longitude, lon);
inst_co.setValue(carbonmonoxide, co);
// inst_co.setMissing(cluster);
// load classifier from file
Classifier cls_co = (Classifier) weka.core.SerializationHelper
.read("/CO_J48Model.model");
result = cls_co.classifyInstance(inst_co);
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
return result;
}
}
You create data object from Instances. Add your instance to this data. After that you can set your values in Instance.
Instances data = new Instances("TestInstances",attributeList,0);
inst_co = new DenseInstance(data.numAttributes());
data.add(inst_co);
I suggest getting header information and Instances values from external file or creating this information only once.