Decision Tree Learning and Impurity

machine-learning data-mining random-forest decision-tree

Jony · Feb 8, 2011 · Viewed 7k times

There are three ways to measure impurity:

Entropy

Gini Index

Classification Error

What are the differences and appropriate use cases for each method?

Answer

If the p_i values are very small, multiplying very small numbers together (as the Gini index does) can lead to rounding error. For that reason, it can be better to add the logs instead (as entropy does). Classification error, following your definition, gives only a coarse estimate, because it uses just the single largest p_i to compute its value.
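To make the comparison concrete, here is a minimal sketch of the three measures. It assumes the usual textbook definitions, which the question does not spell out: entropy as -sum(p_i * log2(p_i)), Gini index as 1 - sum(p_i^2), and classification error as 1 - max(p_i). The function names and the sample distributions are chosen purely for illustration.

```python
import math

# Assumed textbook definitions; p is a list of class probabilities summing to 1.

def entropy(p):
    """Shannon entropy in bits; classes with p_i == 0 contribute nothing."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def gini(p):
    """Gini index: one minus the sum of squared class probabilities."""
    return 1.0 - sum(pi * pi for pi in p)

def classification_error(p):
    """Classification error: depends only on the single largest p_i."""
    return 1.0 - max(p)

# Two distributions with the same largest probability but a different
# split of the remaining mass (illustrative values, not real data).
a = [0.7, 0.2, 0.1]
b = [0.7, 0.15, 0.15]

for name, dist in (("a", a), ("b", b)):
    print(name, entropy(dist), gini(dist), classification_error(dist))
```

Under these definitions, classification error comes out the same for both distributions because it only looks at the 0.7, while entropy and the Gini index both change with how the remaining 0.3 is divided.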
