Option predictive clustering trees for multi-target regression
- Department of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia
- Jožef Stefan International Postgraduate School, Ljubljana, Slovenia
{tomaz.stepisnik,aljaz.osojnik,saso.dzeroski,dragi.kocev}@ijs.si
Abstract
Decision trees are among the most widely used predictive modelling methods, primarily because they are readily interpretable and fast to learn. These properties come at the price of predictive performance. Moreover, standard decision tree induction suffers from myopia: the split in each internal node is selected greedily, so the resulting tree may be sub-optimal. To address these issues, option trees have been proposed, which can include several alternative splits in a new type of internal node called an option node. An option tree can therefore also be regarded as a condensed representation of an ensemble. In this work, we propose to learn option trees for multi-target regression (MTR), the task of learning predictive models for multiple numeric target variables, based on the predictive clustering framework. The resulting models are called option predictive clustering trees (OPCTs). We evaluate the proposed OPCTs on 11 benchmark MTR data sets. The results reveal that OPCTs achieve statistically significantly better predictive performance than a single predictive clustering tree (PCT) and are competitive with bagging and random forests of PCTs. By limiting the number of option nodes, we can achieve a good trade-off between predictive power and efficiency (model size and learning time). We also perform parameter sensitivity analysis and bias-variance decomposition of the mean squared error. Our analysis shows that OPCTs can reduce the variance of PCTs nearly as much as ensemble methods do. In terms of bias, OPCTs occasionally outperform other methods. Finally, we demonstrate the potential of OPCTs for multifaceted interpretability and illustrate the potential for including domain knowledge in the tree learning process.
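To illustrate why an option tree can be read as a condensed ensemble, the following minimal sketch shows prediction with option nodes for multiple numeric targets. It is an illustration under stated assumptions, not the authors' implementation: the class names `Leaf`, `Split` and `Option`, the function `predict`, and the choice to aggregate the alternatives of an option node by a per-target arithmetic mean are all hypothetical, since the abstract does not specify the aggregation.

```python
from dataclasses import dataclass
from typing import List, Sequence, Union


@dataclass
class Leaf:
    prediction: List[float]  # one predicted value per target variable


@dataclass
class Split:
    feature: int       # index of the descriptive attribute that is tested
    threshold: float   # examples with x[feature] <= threshold go left
    left: "Node"
    right: "Node"


@dataclass
class Option:
    options: List["Node"]  # alternative subtrees, each rooted at a different split


Node = Union[Leaf, Split, Option]


def predict(node: Node, x: Sequence[float]) -> List[float]:
    """Predict all targets for example x (assumed aggregation: per-target mean)."""
    if isinstance(node, Leaf):
        return list(node.prediction)
    if isinstance(node, Split):
        child = node.left if x[node.feature] <= node.threshold else node.right
        return predict(child, x)
    # Option node: follow every alternative and average the results per target.
    preds = [predict(child, x) for child in node.options]
    return [sum(p[t] for p in preds) / len(preds) for t in range(len(preds[0]))]


# A toy option tree with two alternative root splits and two targets.
tree = Option([
    Split(0, 0.5, Leaf([1.0, 10.0]), Leaf([3.0, 30.0])),
    Split(1, 2.0, Leaf([2.0, 20.0]), Leaf([4.0, 40.0])),
])

print(predict(tree, [0.2, 5.0]))  # [2.5, 25.0]
```

Choosing a single alternative at every option node yields an ordinary tree, so a structure like this implicitly contains several trees that share their common parts.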
Key words
multi-target regression, option trees, interpretable models, predictive clustering trees, bias-variance decomposition of error
Digital Object Identifier (DOI)
https://doi.org/10.2298/CSIS190928006S
Publication information
Volume 17, Issue 2 (June 2020)
Year of Publication: 2020
ISSN: 2406-1018 (Online)
Publisher: ComSIS Consortium
Full text
Available in PDF
How to cite
Stepišnik, T., Osojnik, A., Džeroski, S., Kocev, D.: Option predictive clustering trees for multi-target regression. Computer Science and Information Systems, Vol. 17, No. 2, 459–486. (2020), https://doi.org/10.2298/CSIS190928006S