Determines the importance of each feature by measuring the decrease in
impurity produced by every split that uses the feature, weighted by the
fraction of the data that reaches the node where the split occurs.
This method works only for classification datasets because it relies on the
ImpurityScore class, but it may use any impurity measure that class supports.
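
As a rough illustration of the weighting described above, the following is a minimal sketch of the computation for a single tree, not this class's actual implementation. It assumes Gini impurity as the measure (any supported impurity measure could stand in), and the names gini, split_importance, and mdi, as well as the hand-built toy tree, are hypothetical and used only for this example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (assumed impurity measure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_importance(parent, left, right, n_total):
    """Impurity decrease of one split, weighted by the share of data at the node."""
    n = len(parent)
    child_impurity = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return (n / n_total) * (gini(parent) - child_impurity)


def mdi(splits, n_features, n_total):
    """Sum the weighted impurity decreases per feature.

    splits: list of (feature_index, parent_labels, left_labels, right_labels),
    one entry per internal node of a single (hypothetical) tree.
    """
    importances = [0.0] * n_features
    for feature, parent, left, right in splits:
        importances[feature] += split_importance(parent, left, right, n_total)
    return importances


# Hypothetical toy tree: the root splits on feature 0, both children split on feature 1.
root = [0, 0, 0, 0, 1, 1, 1, 1]
splits = [
    (0, root, [0, 0, 0, 1], [0, 1, 1, 1]),
    (1, [0, 0, 0, 1], [0, 0, 0], [1]),
    (1, [0, 1, 1, 1], [0], [1, 1, 1]),
]
importances = mdi(splits, n_features=2, n_total=len(root))
```

In this toy tree, feature 0 receives an importance of 0.125 and feature 1 receives 0.375. Because every leaf is pure, the two values sum to the root impurity of 0.5. For a forest, the per-tree values would typically be averaged over all trees.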
For more info, see:
- Louppe, G., Wehenkel, L., Sutera, A., & Geurts, P. (2013).
Understanding variable importances in forests of randomized trees. In
C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, & K. Q. Weinberger
(Eds.), Advances in Neural Information Processing Systems 26 (pp. 431–439).
- Breiman, L. (2002). Manual on setting up, using, and understanding random
forests v3.1. Statistics Department, University of California, Berkeley, CA,
USA.