Ensembles of Oblique Decision Trees
Ensemble methods such as bagging are widely used because they can improve generalization performance and stability compared to individual base estimators, and they often use decision trees as those base estimators. Standard decision-tree algorithms learn univariate splits in decision nodes, resulting in axis-parallel decision boundaries; the limited representational power of each decision node can lead to very deep trees. Oblique decision trees, which have seen renewed interest in recent years, instead learn multivariate linear splits in decision nodes and are generally shallower than their axis-parallel counterparts owing to the increased representational power of such splits. This thesis analyzes the performance of different oblique decision tree algorithms when used as base estimators in a bagging ensemble. In particular, it explores the trade-off between node complexity (oblique vs. axis-parallel splits) and tree complexity (shallower oblique trees vs. deeper axis-parallel trees), and its effect on ensemble performance. Bagging ensembles of several state-of-the-art oblique tree algorithms are compared with standard bagging approaches on a range of data sets. The analysis highlights two key results: (1) randomization is a powerful and efficient technique for learning trees in ensembles, since it promotes ensemble diversity by decreasing the correlation between estimators; (2) for larger problems, with many features or classes, optimization-based oblique decision tree ensembles are effective, but at the expense of a greater computational cost.
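The core distinction above, a univariate (axis-parallel) split versus a multivariate linear (oblique) split, can be illustrated with a minimal sketch. This is not code from the thesis; the function names and the diagonal toy problem are illustrative assumptions. On data whose true boundary is diagonal, a single oblique split separates the classes, while any single axis-parallel threshold cannot:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # diagonal decision boundary

def axis_parallel_split(X, feature, threshold):
    # Univariate split: threshold a single feature (standard decision tree)
    return X[:, feature] > threshold

def oblique_split(X, weights, threshold):
    # Multivariate linear split: threshold a linear combination of features
    return X @ weights > threshold

# One oblique split with weights (1, 1) recovers the boundary exactly.
oblique_pred = oblique_split(X, np.array([1.0, 1.0]), 0.0).astype(int)
oblique_acc = (oblique_pred == y).mean()

# The best single axis-parallel split stays below perfect accuracy,
# which is why an axis-parallel tree must grow deeper to compensate.
best_axis_acc = max(
    (axis_parallel_split(X, f, t).astype(int) == y).mean()
    for f in (0, 1)
    for t in np.linspace(-1, 1, 41)
)

print(oblique_acc, best_axis_acc)
```

The gap between the two accuracies is the node-complexity vs. tree-complexity trade-off in miniature: the oblique node is more expensive to learn but removes the need for many stacked axis-parallel splits.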