Option predictive clustering trees for multi-target regression

Tomaž Stepišnik (1, 2), Aljaž Osojnik (1, 2), Sašo Džeroski (1, 2) and Dragi Kocev (1, 2)

  1. Department of Knowledge Technologies, Jožef Stefan Institute
    Ljubljana, Slovenia
  2. Jožef Stefan International Postgraduate School
    Ljubljana, Slovenia
    {tomaz.stepisnik,aljaz.osojnik,saso.dzeroski,dragi.kocev}@ijs.si

Abstract

Decision trees are one of the most widely used predictive modelling methods, primarily because they are readily interpretable and fast to learn. These desirable properties come at the cost of predictive performance. Moreover, the standard induction of decision trees suffers from myopia: in each internal node, a single split is selected greedily; hence, the resulting tree may be sub-optimal. To address these issues, option trees have been proposed, which can include several alternative splits in a new type of internal node called an option node. An option tree can therefore also be regarded as a condensed representation of an ensemble. In this work, we propose to learn option trees for multi-target regression (MTR) based on the predictive clustering framework. The resulting models are thus called option predictive clustering trees (OPCTs). Multi-target regression is concerned with learning predictive models for tasks with multiple numeric target variables. We evaluate the proposed OPCTs on 11 benchmark MTR data sets. The results reveal that OPCTs achieve statistically significantly better predictive performance than a single predictive clustering tree (PCT) and are competitive with bagging and random forests of PCTs. By limiting the number of option nodes, we can achieve a good trade-off between predictive power and efficiency (model size and learning time). We also perform parameter sensitivity analysis and bias-variance decomposition of the mean squared error. Our analysis shows that OPCTs can reduce the variance of PCTs nearly as much as ensemble methods do. In terms of bias, OPCTs occasionally outperform other methods. Finally, we demonstrate the potential of OPCTs for multifaceted interpretability and illustrate the potential for inclusion of domain knowledge in the tree learning process.
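
To make the idea of an option node concrete, the following is a minimal sketch of how a tree containing option nodes could produce a multi-target prediction. It is not the authors' implementation: the class names (`Leaf`, `Split`, `Option`), the function `predict`, and the choice to average the predictions of the alternatives under an option node are all assumptions made for illustration, since the abstract does not state how alternatives are combined.

```python
from dataclasses import dataclass
from typing import List, Union

Vector = List[float]


@dataclass
class Leaf:
    # Hypothetical leaf: stores one predicted value per target variable.
    prediction: Vector


@dataclass
class Split:
    # Hypothetical ordinary internal node: a single test on one feature.
    feature: int
    threshold: float
    left: "Node"
    right: "Node"


@dataclass
class Option:
    # Hypothetical option node: keeps several alternative subtrees.
    alternatives: List["Node"]


Node = Union[Leaf, Split, Option]


def predict(node: Node, x: Vector) -> Vector:
    """Return a multi-target prediction for the example x."""
    if isinstance(node, Leaf):
        return node.prediction
    if isinstance(node, Split):
        child = node.left if x[node.feature] <= node.threshold else node.right
        return predict(child, x)
    # Option node: follow every alternative and average per target.
    # Averaging is an assumption of this sketch.
    preds = [predict(alt, x) for alt in node.alternatives]
    return [sum(values) / len(preds) for values in zip(*preds)]


# A toy tree with two targets whose root is an option node
# holding two alternative splits.
tree = Option([
    Split(0, 0.5, Leaf([1.0, 10.0]), Leaf([3.0, 30.0])),
    Split(1, 2.0, Leaf([2.0, 20.0]), Leaf([4.0, 40.0])),
])

result = predict(tree, [0.2, 5.0])
```

In this toy tree, the example follows the left branch of the first alternative and the right branch of the second, so `result` is the per-target mean of the two leaves reached. Each alternative under the option node is itself a readable tree, which is one way to picture an option tree as a condensed ensemble.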

Key words

multi-target regression, option trees, interpretable models, predictive clustering trees, bias-variance decomposition of error

Digital Object Identifier (DOI)

https://doi.org/10.2298/CSIS190928006S

Publication information

Volume 17, Issue 2 (June 2020)
Year of Publication: 2020
ISSN: 2406-1018 (Online)
Publisher: ComSIS Consortium

How to cite

Stepišnik, T., Osojnik, A., Džeroski, S., Kocev, D.: Option predictive clustering trees for multi-target regression. Computer Science and Information Systems, Vol. 17, No. 2, 459–486. (2020), https://doi.org/10.2298/CSIS190928006S