  


A mixed-integer optimization approach to an exhaustive cross-validated model selection for regression
Dennis Kreber (kreberdunitrier.de)

Abstract: We consider a linear regression model in which we assume that many of the observed regressors are irrelevant for the prediction. To avoid overfitting, we conduct a variable selection and include only the true predictors in the least-squares fit. Best subset selection has gained much interest in recent years for addressing this objective. This method solves a mixed-integer optimization problem that finds the subset of at most k variables, for a given natural number k, that is optimal with respect to the in-sample error. In practice, a best subset selection is computed for each k, and the ideal k is then chosen via validation. We argue that the notion of best subset selection might be misaligned with the statistical intention. Instead, we propose a subset selection formulation based on the cross-validation loss function. We present a discrete optimization formulation that fits coefficients to training data and decides whether to include or exclude each variable so as to minimize the cross-validation error. Hence, we do not require a fixed sparsity bound and do not have to solve successive discrete optimization problems. Moreover, we present bounds for the regression coefficients, which allow us to construct a tighter mixed-integer formulation. Finally, we conduct a simulation study and provide evidence that the novel mixed-integer formulation provides excellent predictions, surpassing the results of competing state-of-the-art approaches.

Keywords: Best Subset Selection, Sparse Regression, Cross-Validation, Mixed-Integer Quadratic Programming, Bilevel Optimization

Category 1: Applications - Science and Engineering (Statistics)

Category 2: Integer Programming ((Mixed) Integer Nonlinear Programming)

Entry Submitted: 05/03/2019