Optimization Online





 

Strong Optimal Classification Trees

Sina Aghaei(saghaei***at***usc.edu)
Andrés Gómez(gomezand***at***usc.edu)
Phebe Vayanos(phebe.vayanos***at***usc.edu)

Abstract: Decision trees are among the most popular machine learning models and are used routinely in applications ranging from revenue management and medicine to bioinformatics. In this paper, we consider the problem of learning optimal binary classification trees. Literature on the topic has burgeoned in recent years, motivated both by the empirical suboptimality of heuristic approaches and the tremendous improvements in mixed-integer optimization (MIO) technology. Yet, existing MIO-based approaches from the literature do not leverage the power of MIO to its full extent: they rely on weak formulations, resulting in slow convergence and large optimality gaps. To fill this gap in the literature, we propose an intuitive flow-based MIO formulation for learning optimal binary classification trees. Our formulation can accommodate side constraints to enable the design of interpretable and fair decision trees. Moreover, we show that our formulation has a stronger linear optimization relaxation than existing methods. We exploit the decomposable structure of our formulation and max-flow/min-cut duality to derive a Benders’ decomposition method to speed up computation. We propose a tailored procedure for solving each decomposed subproblem that provably generates facets of the feasible set of the MIO as constraints to add to the main problem. We conduct extensive computational experiments on standard benchmark datasets, on which we show that our proposed approaches are 31 times faster than state-of-the-art MIO-based techniques and improve out-of-sample performance by up to 8%.

Keywords: Optimal Classification Trees, Mixed-Integer Optimization, Benders' Decomposition, Machine Learning

Category 1: Integer Programming ((Mixed) Integer Linear Programming)

Category 2: Applications -- Science and Engineering (Statistics)

Category 3: Combinatorial Optimization

Citation: Technical Report, University of Southern California, January 2021

Entry Submitted: 01/20/2021
Entry Accepted: 01/21/2021
Entry Last Modified: 01/20/2021
