How is feature importance calculated for GradientBoostingClassifier

Asked by D.W. · May 24, 2017 · Viewed 7.6k times

I'm using scikit-learn's gradient-boosted trees classifier, GradientBoostingClassifier. It exposes feature importance scores through the feature_importances_ attribute. How are these feature importances calculated?

I'd like to know which algorithm scikit-learn uses, so I can interpret those numbers correctly. The algorithm isn't listed in the documentation.
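For context, here is a minimal sketch of where these scores show up (the synthetic dataset and hyperparameters are hypothetical, chosen only for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical toy data; any fitted classifier exposes the attribute.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
clf = GradientBoostingClassifier(n_estimators=50, random_state=0).fit(X, y)
print(clf.feature_importances_)  # one non-negative score per feature
```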

Answer

Answered by D.W. · May 24, 2017

This is documented elsewhere in the scikit-learn documentation. In particular, here is how it works:

For each tree, we calculate the feature importance of a feature F as the fraction of samples that traverse a node that splits on feature F (see here). Then, we average those numbers across all trees (as described here).

It is not described exactly how scikit-learn estimates the fraction of samples that will traverse a node that splits on feature F.
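A literal reading of that description can be sketched against the public tree_ arrays of the fitted trees (synthetic data assumed; note that scikit-learn's actual implementation may additionally weight each node, for example by the impurity decrease at the split, so these numbers need not match feature_importances_ exactly):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical toy data, used only to have fitted trees to inspect.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
clf = GradientBoostingClassifier(n_estimators=20, random_state=0).fit(X, y)

n_features = X.shape[1]
per_tree = []
for stage in clf.estimators_:          # each boosting stage holds regression trees
    for reg_tree in stage:
        t = reg_tree.tree_
        scores = np.zeros(n_features)
        for node in range(t.node_count):
            f = t.feature[node]
            if f >= 0:                 # internal node (leaves use a negative sentinel)
                # fraction of this tree's training samples passing through the split
                scores[f] += t.n_node_samples[node] / t.n_node_samples[0]
        per_tree.append(scores)

# average the per-tree scores across all trees, as the description says
importances = np.mean(per_tree, axis=0)
print(importances)
```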

Interpretation: the result is an array of shape (n_features,) whose values are non-negative and sum to 1.0, so every score lies in the range [0, 1]. Higher scores mean the feature is more important.
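Those properties can be checked directly on a fitted model (again with a hypothetical synthetic dataset); a common use is ranking features by score:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical toy data just for illustration.
X, y = make_classification(n_samples=300, n_features=6, n_informative=3,
                           random_state=1)
clf = GradientBoostingClassifier(random_state=1).fit(X, y)

imp = clf.feature_importances_
print(imp.shape)                 # (n_features,)
print(imp.sum())                 # normalized, so this is 1.0 (up to float error)
ranking = np.argsort(imp)[::-1]  # feature indices, most important first
print(ranking)
```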