Optimization Online


Subset Selection by Mallows' Cp: A Mixed Integer Programming Approach

Ryuhei Miyashiro (r-miya***at***cc.tuat.ac.jp)
Yuichi Takano (takano.y.ad***at***m.titech.ac.jp)

Abstract: This paper concerns a method of selecting the best subset of explanatory variables for a linear regression model. Employing Mallows' C_p as a goodness-of-fit measure, we formulate the subset selection problem as a mixed integer quadratic programming problem. Computational results demonstrate that our method provides the best subset of variables in a few seconds when the number of candidate explanatory variables is fewer than 30. Furthermore, on datasets with a large number of samples, it finds better-quality solutions faster than stepwise regression methods do.
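
To make the selection criterion concrete, here is a minimal sketch of subset selection by Mallows' C_p using brute-force enumeration. It is not the mixed integer quadratic programming formulation described in the abstract. It assumes the standard definition C_p = RSS_S / s^2 - n + 2(|S| + 1), where the model includes an intercept and s^2 is the residual variance estimate of the full model. The function names (solve, rss, mallows_cp, best_subset_cp) and the small dataset are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def rss(X, y, subset):
    """Residual sum of squares of least squares with an intercept and the columns in `subset`."""
    Z = [[1.0] + [row[j] for j in subset] for row in X]
    m = len(Z[0])
    # Normal equations: (Z^T Z) beta = Z^T y
    G = [[sum(z[a] * z[b] for z in Z) for b in range(m)] for a in range(m)]
    g = [sum(z[a] * yi for z, yi in zip(Z, y)) for a in range(m)]
    beta = solve(G, g)
    return sum((yi - sum(bk * zk for bk, zk in zip(beta, z))) ** 2
               for z, yi in zip(Z, y))


def mallows_cp(X, y, subset):
    """Mallows' C_p of `subset`, with the variance estimated from the full model."""
    n, p = len(X), len(X[0])
    sigma2 = rss(X, y, tuple(range(p))) / (n - p - 1)
    return rss(X, y, subset) / sigma2 - n + 2 * (len(subset) + 1)


def best_subset_cp(X, y):
    """Enumerate all 2^p subsets and return (C_p, subset) with the smallest C_p."""
    p = len(X[0])
    candidates = (S for k in range(p + 1) for S in combinations(range(p), k))
    return min((mallows_cp(X, y, S), S) for S in candidates)


# Illustrative data (an assumption, not from the paper): y depends on column 0 only.
X = [[1, 2, 5], [2, 1, 3], [3, 4, 1], [4, 3, 6],
     [5, 6, 2], [6, 5, 4], [7, 8, 8], [8, 7, 7]]
noise = [0.1, -0.2, 0.15, -0.05, 0.2, -0.1, -0.15, 0.05]
y = [2 + 3 * row[0] + e for row, e in zip(X, noise)]

cp, subset = best_subset_cp(X, y)
print("best subset:", subset, "C_p:", round(cp, 3))
```

Enumeration visits all 2^p subsets, so it is practical only for a small number of candidate variables. Under this definition the full model always has C_p equal to p + 1, which is a convenient sanity check.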

Keywords: Subset selection, Mixed integer programming, Mallows' C_p, Linear regression model

Category 1: Integer Programming ((Mixed) Integer Nonlinear Programming)

Category 2: Nonlinear Optimization (Quadratic Programming)

Category 3: Applications -- Science and Engineering (Statistics)

Citation: Published in Expert Systems with Applications, 42 (2015), 325-331.


Entry Submitted: 01/18/2014
Entry Accepted: 01/18/2014
Entry Last Modified: 09/04/2014
