Test and Training Set are Not Compatible

Dark Knight · Nov 5, 2013 · Viewed 7.2k times

I have seen various articles about this issue and tried many of the suggested solutions, but nothing is working. Kindly advise.

I am getting an error in WEKA:

"Problem Evaluating Classifier: Test and Training Set are Not Compatible".

I am using J48 as my algorithm.

These are my training and evaluation sets:

Trainset:
https://www.dropbox.com/s/fm0n1vkwc4yj8yn/train.csv

Evalset:
https://www.dropbox.com/s/2j9jgxnoxr8xjdx/Eval.csv

(The files are too long to paste here.)

I have tried "Batch Filtering" in WEKA (for the training set), but it still does not work.

EDIT: I have even converted my .csv files to .arff, but the issue is still the same.

EDIT2: I have made sure the headers in both CSV files match. Even then, the issue is the same. Please help!

Please advise.

Answer

Walter · Nov 5, 2013

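A common problem when converting ".csv" files to ".arff" with Weka is that the values of a nominal attribute appear in a different order, or are missing entirely, from one dataset to the other. This happens because the converter builds each attribute's value list from the values it actually sees in that file.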

Your evaluation ".arff" file probably looks like this (skipping irrelevant data):

@relation Eval
@attribute a321 {TRUE}

Your train ".arff" file probably looks like this (skipping irrelevant data):

@relation train
@attribute a321 {FALSE}

However, both should contain all possible values for that attribute, and in the same order:

@attribute a321 {TRUE, FALSE}

You can remedy this by post-processing your ".arff" files in a text editor and changing the header so that your nominal values appear in the same order (and quantity) from file to file.
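If the files have many nominal attributes, editing the headers by hand gets tedious. The following is a minimal Python sketch of the same post-processing step, not part of Weka itself. The helper names `parse_nominal` and `unify_headers` are hypothetical, and it assumes dense ARFF files in which attribute names and nominal values contain no spaces, quotes, or embedded commas. For each nominal attribute it merges the values from both headers in first-seen order and writes the same declaration into both files.

```python
import re

# Matches a nominal declaration such as: @attribute a321 {TRUE, FALSE}
# Assumption: attribute names and values are unquoted and contain no commas.
NOMINAL = re.compile(r"^\s*@attribute\s+(\S+)\s*\{(.*)\}\s*$", re.IGNORECASE)


def parse_nominal(line):
    """Return (name, [values]) for a nominal @attribute line, else None."""
    match = NOMINAL.match(line)
    if match is None:
        return None
    values = [v.strip() for v in match.group(2).split(",") if v.strip()]
    return match.group(1), values


def unify_headers(train_lines, eval_lines):
    """Give both files identical nominal declarations.

    Both arguments are lists of lines without trailing newlines.
    Values are merged in first-seen order, training file first.
    """
    merged = {}
    for line in list(train_lines) + list(eval_lines):
        parsed = parse_nominal(line)
        if parsed is None:
            continue
        name, values = parsed
        seen = merged.setdefault(name, [])
        for value in values:
            if value not in seen:
                seen.append(value)

    def rewrite(lines):
        out = []
        for line in lines:
            parsed = parse_nominal(line)
            if parsed is None:
                out.append(line)  # @relation, numeric attributes, data rows
            else:
                name = parsed[0]
                out.append("@attribute %s {%s}" % (name, ", ".join(merged[name])))
        return out

    return rewrite(train_lines), rewrite(eval_lines)


if __name__ == "__main__":
    train = ["@relation train", "@attribute a321 {FALSE}", "@data", "FALSE"]
    evaluation = ["@relation Eval", "@attribute a321 {TRUE}", "@data", "TRUE"]
    new_train, new_eval = unify_headers(train, evaluation)
    print(new_train[1])  # @attribute a321 {FALSE, TRUE}
    print(new_eval[1])   # @attribute a321 {FALSE, TRUE}
```

To use it on real files, read each ".arff" with `open(path).read().splitlines()`, pass both lists to `unify_headers`, and write the results back joined with newlines. The data rows are left untouched, which is safe for dense ARFF because rows refer to nominal values by label rather than by position in the header.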