Empirical comparison of tree ensemble variable importance measures
Date
2011
Abstract
Tree ensembles are becoming well established as popular and powerful data modelling techniques. Tree ensemble models are essentially black box models, even though their individual member trees may not be, and as their popularity has grown, so has interest in their interpretation. This study presents variable importance measures associated with random forests, conditional inference forests and boosted trees, and compares these measures on a number of simulated data sets. Overall, variable importance indicators based on bagged conditional inference forests appear to strike a good balance between identifying significant variables and avoiding unnecessary flagging of correlated variables. Data preprocessing and interpretation by experts familiar with a specific data set remain vital.
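As a rough illustration of the kind of comparison the abstract describes, the sketch below computes a permutation-based variable importance for a small tree ensemble on simulated data. It is not the study's method or data: the ensemble is a stand-in of boosted decision stumps written in plain Python, and the simulated variables (`x1` informative, `x2` correlated with `x1` but not used to generate the response, `x3` pure noise), the helper names and all parameter values are assumptions chosen only for illustration.

```python
import random

def fit_stump(X, r, n_thresholds=10):
    """Find the single split (feature, threshold) that best fits residuals r."""
    best = None
    n = len(X)
    for j in range(len(X[0])):
        col = sorted(row[j] for row in X)
        # Candidate thresholds at evenly spaced order statistics.
        for k in range(1, n_thresholds):
            t = col[k * n // n_thresholds]
            left = [r[i] for i in range(n) if X[i][j] <= t]
            right = [r[i] for i in range(n) if X[i][j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:]

def fit_boosted_stumps(X, y, rounds=50, lr=0.3):
    """Least-squares boosting: each stump is fitted to the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        j, t, lm, rm = fit_stump(X, resid)
        stumps.append((j, t, lr * lm, lr * rm))
        pred = [p + (lr * lm if row[j] <= t else lr * rm) for p, row in zip(pred, X)]
    return base, stumps

def predict(model, X):
    base, stumps = model
    return [base + sum(l if row[j] <= t else r for j, t, l, r in stumps) for row in X]

def mse(model, X, y):
    return sum((p - yi) ** 2 for p, yi in zip(predict(model, X), y)) / len(y)

def permutation_importance(model, X, y, rng, repeats=5):
    """Mean increase in test MSE when one variable's values are shuffled."""
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increase = 0.0
        for _ in range(repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            increase += mse(model, X_perm, y) - baseline
        importances.append(increase / repeats)
    return importances

# Simulated data (assumed design): y depends on x1 only.
rng = random.Random(0)
X, y = [], []
for _ in range(300):
    x1 = rng.gauss(0, 1)
    x2 = x1 + 0.5 * rng.gauss(0, 1)   # correlated with x1, not in the response
    x3 = rng.gauss(0, 1)              # independent noise
    X.append([x1, x2, x3])
    y.append(3 * x1 + 0.3 * rng.gauss(0, 1))

model = fit_boosted_stumps(X[:200], y[:200])
importances = permutation_importance(model, X[200:], y[200:], rng)
for name, imp in zip(["x1", "x2", "x3"], importances):
    print(f"{name}: {imp:.3f}")
```

In a design like this, the importance assigned to `x2` is the quantity of interest: a measure that ranks it highly is flagging a correlated variable that plays no role in generating the response.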
Related items
Showing items related by title, author, creator and subject.
- Chow, Chi Ngok (2010) The largest wool exporter in the world is Australia, where wool being a major export is worth over AUD $2 billion per year and constitutes about 17 per cent of all agricultural exports. Most Australian wool is sold by ...
- Aldrich, Chris; Auret, L. (2010) The ever-present drive to safer, more cost-effective and cleaner processes motivates the exploration of a variety of process monitoring methods. In the domain of data-driven approaches, random forest models present a ...
- Auret, L.; Aldrich, Chris (2010) Process monitoring technology plays a vital role in the automation of mineral processing plants, where there is an increased emphasis on safe, cost-effective, and environmentally responsible operation. Members of an ...