Get feature and class names into decision tree using export graphviz

sokeefe1014 picture sokeefe1014 · Sep 13, 2016 · Viewed 12.7k times · Source

Good Afternoon,

I am working on a decision tree classifier and am having trouble visualizing it. I can output the decision tree, however I cannot get my feature or class names/labels into it. My data is in a pandas dataframe format which I then move into a numpy array and pass to the classifier. I've tried a few things, but just seem to error out on the export when I try and specify class names. Any help would be appreciated. Code is below.

all_inputs=df.ix[:,14:].values
all_classes=df['wic'].values

(training_inputs,
 testing_inputs,
 training_classes,
 testing_classes) = train_test_split(all_inputs, all_classes,train_size=0.75, random_state=1)

decision_tree_classifier=DecisionTreeClassifier()
decision_tree_classifier.fit(training_inputs,training_classes)

export_graphviz(decision_tree_classifier, out_file="mytree.dot",  
                     feature_names=??,  
                     class_names=??)  

LIke I said, it runs fine and outputs a decision tree viz if I take out the feature_names and class_names parameters. I'd like to include them in the output though if possible and have hit a wall...

Any help would be greatly appreciated!

Thanks,

Scott

Answer

maxymoo picture maxymoo · Sep 14, 2016

The class names are stored in decision_tree_classifier.classes_, i.e. the classes_ attribute of your DecisionTreeClassifier instance. And the feature names should be the columns of your input dataframe. For your case you will have

class_names = decision_tree_classifier.classes_
feature_names = df.columns[14:]