Optimization Online


Best subset selection for eliminating multicollinearity

Ryuta Tamura(s154558y***at***st.go.tuat.ac.jp)
Ken Kobayashi(ken-kobayashi***at***jp.fujitsu.com)
Yuichi Takano(ytakano***at***isc.senshu-u.ac.jp)
Ryuhei Miyashiro(r-miya***at***cc.tuat.ac.jp)
Kazuhide Nakata(nakata.k.ac***at***m.titech.ac.jp)
Tomomi Matsui(matsui.t.af***at***m.titech.ac.jp)

Abstract: This paper proposes a method for eliminating multicollinearity from linear regression models. Specifically, we select the best subset of explanatory variables subject to an upper bound on the condition number of the correlation matrix of the selected variables. We first develop a cutting plane algorithm that iteratively appends valid inequalities to a mixed integer quadratic optimization problem in order to approximate the condition number constraint. We also devise mixed integer semidefinite optimization formulations for best subset selection under the condition number constraint. Computational results demonstrate that our cutting plane algorithm frequently provides solutions of better quality than those obtained using local search algorithms for subset selection. Additionally, subset selection by means of our optimization formulations succeeds when the number of candidate explanatory variables is small.
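
Illustration: the condition number constraint described in the abstract can be made concrete with a small brute-force sketch. This is not the cutting plane algorithm or the mixed integer semidefinite formulation of the paper; it simply enumerates every subset, which is only practical for a handful of variables. It assumes that subset quality is measured by the residual sum of squares on standardized data (the abstract does not state the objective), and the names condition_number, best_subset and kappa are hypothetical.

```python
import itertools

import numpy as np


def condition_number(corr):
    """Ratio of largest to smallest eigenvalue of a symmetric matrix."""
    eig = np.linalg.eigvalsh(corr)  # eigenvalues in ascending order
    # Treat a numerically singular matrix as infinitely ill-conditioned.
    return float("inf") if eig[0] <= 1e-12 else float(eig[-1] / eig[0])


def best_subset(X, y, kappa):
    """Enumerate subsets; keep the lowest-RSS one with condition number <= kappa.

    Assumption: the objective is the residual sum of squares (RSS).
    """
    p = X.shape[1]
    Z = (X - X.mean(axis=0)) / X.std(axis=0)  # standardized columns
    yc = y - y.mean()                         # centered response
    R = np.corrcoef(X, rowvar=False)          # full correlation matrix
    best_rss, best_set = float(yc @ yc), ()   # empty model as baseline
    for k in range(1, p + 1):
        for subset in itertools.combinations(range(p), k):
            idx = list(subset)
            # Skip subsets whose correlation submatrix is too ill-conditioned.
            if condition_number(R[np.ix_(idx, idx)]) > kappa:
                continue
            beta, *_ = np.linalg.lstsq(Z[:, idx], yc, rcond=None)
            resid = yc - Z[:, idx] @ beta
            rss = float(resid @ resid)
            if rss < best_rss:
                best_rss, best_set = rss, subset
    return best_rss, best_set


# Synthetic data (made up for illustration): column 1 nearly duplicates column 0.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.01 * rng.normal(size=200)
x3 = rng.normal(size=200)
X = np.column_stack([x1, x2, x3])
y = x1 + x3 + 0.1 * rng.normal(size=200)

rss, subset = best_subset(X, y, kappa=10.0)
print(subset, rss)
```

With the bound kappa = 10, any subset containing both nearly collinear columns is rejected, so the selected model keeps at most one of them. Enumeration grows as 2^p in the number of candidate variables p, which is why exact formulations and cutting planes matter for anything larger.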

Keywords: Optimization, statistics, subset selection, multicollinearity, linear regression, mixed integer semidefinite optimization

Category 1: Applications -- Science and Engineering

Category 2: Global Optimization

Category 3: Integer Programming



Entry Submitted: 07/26/2016
Entry Accepted: 07/26/2016
Entry Last Modified: 07/26/2016
