I need help to interpret result in weka using the J48
I dont know how to explain the result, I am using the dataset Heart Disease Data Set from http://archive.ics.uci.edu/ml/datasets/Heart+Disease
And the J48 tree
Please help me, with some points importants for this analyse my result is:
=== Run information ===
=== Classifier model (full training set) ===
cp <= 3
| sex <= 0: 0 (57.0/2.0)
| sex > 0
| | slope <= 1
| | | fbs <= 0
| | | | trestbps <= 152
| | | | | thalach <= 162
| | | | | | ca <= 1
| | | | | | | age <= 56: 0 (12.0/1.0)
| | | | | | | age > 56: 1 (3.0/1.0)
| | | | | | ca > 1: 1 (2.0)
| | | | | thalach > 162: 0 (27.0)
| | | | trestbps > 152: 1 (4.0/1.0)
| | | fbs > 0: 0 (9.0)
| | slope > 1
| | | slope <= 2
| | | | ca <= 0
| | | | | fbs <= 0
| | | | | | chol <= 261
| | | | | | | oldpeak <= 2.5: 0 (11.61/1.0)
| | | | | | | oldpeak > 2.5: 1 (3.0)
| | | | | | chol > 261: 1 (4.0)
| | | | | fbs > 0: 0 (4.0)
| | | | ca > 0
| | | | | thal <= 6: 1 (6.0/1.0)
| | | | | thal > 6
| | | | | | thalach <= 145: 0 (3.39)
| | | | | | thalach > 145: 1 (5.0/1.0)
| | | slope > 2: 0 (8.0/1.0)
cp > 3
| thal <= 3
| | ca <= 2
| | | exang <= 0
| | | | sex <= 0
| | | | | chol <= 304: 0 (14.0)
| | | | | chol > 304: 1 (3.0/1.0)
| | | | sex > 0
| | | | | ca <= 0: 0 (10.0/1.0)
| | | | | ca > 0: 1 (3.0)
| | | exang > 0
| | | | restecg <= 1
| | | | | slope <= 1: 0 (2.0)
| | | | | slope > 1: 1 (5.37)
| | | | restecg > 1
| | | | | ca <= 0: 0 (4.0)
| | | | | ca > 0
| | | | | | ca <= 1
| | | | | | | thalach <= 113: 0 (2.0)
| | | | | | | thalach > 113: 1 (4.0)
| | | | | | ca > 1: 0 (2.0)
| | ca > 2: 1 (4.0)
| thal > 3
| | fbs <= 0
| | | ca <= 0
| | | | chol <= 278: 0 (23.0/8.0)
| | | | chol > 278: 1 (6.0)
| | | ca > 0: 1 (46.0/12.0)
| | fbs > 0
| | | ca <= 1: 1 (3.88)
| | | ca > 1: 0 (11.75/4.75)
Number of Leaves : 31
Size of the tree : 61
If you are using Weka Explorer, you can right click on the result row in the results list (located on the left of the window under the start button). Then select visualize tree. This will display an image of the tree.
If you still want to understand the results as they are shown in your question:
The results are displayed as tree. The root of the tree starts at the left and the first feature used is called cp. If cp is smaller or equal to 3, then the next feature in the tree is sex and so on. You can see that when you split by sex and sex <= 0 you reach a prediction. The prediction is 0 and the (57/2) means that 57 observations in the training set end up at this path and 2 were incorrectly classified, i.e. 55 had the label 0 and 2 had the label 1.
Here is how the start of the tree looks like:
--------start---------
| |
| |
|cp > 3 | cp <= 3
_________|______ ____|__________
| | | |
|thal>3 |thal<=3 |sex>0 |sex<=0
| | | |
... ... ... prediction 0 57(55,2)