  


Learning a Mixture of Gaussians via Mixed Integer Optimization
Hari Bandi (hbandi@mit.edu)

Abstract: We consider the problem of estimating the parameters of a multivariate Gaussian mixture model (GMM) given access to $n$ samples $\mathbf{x}_1,\mathbf{x}_2,\ldots,\mathbf{x}_n \in \mathbb{R}^d$ that are believed to have come from a mixture of multiple subpopulations. State-of-the-art algorithms for recovering these parameters use heuristics either to maximize the log-likelihood of the sample or to fit the first few moments of the GMM to the sample moments. In contrast, we present here a novel Mixed Integer Optimization (MIO) formulation that optimally recovers the parameters of the GMM by minimizing a discrepancy measure (either the Kolmogorov-Smirnov or the total variation distance) between the empirical distribution function and the distribution function of the GMM whenever the mixture component weights are known. We also present an algorithm for multidimensional data that optimally recovers the corresponding means and covariance matrices. We show that the MIO approaches are practically solvable for datasets with $n$ in the tens of thousands in minutes and achieve an average improvement of 60-70\% and 50-60\% on mean absolute percentage error (MAPE) in estimating the means and the covariance matrices, respectively, over the EM algorithm, independent of the sample size $n$. As the separation of the Gaussians decreases and the problem correspondingly becomes more difficult, the edge in performance in favor of the MIO methods widens. Finally, we also show that the MIO methods outperform the EM algorithm with an average improvement of 4-5\% on the out-of-sample accuracy for real-world datasets.

Keywords: Gaussian Mixture Models, Mixed Integer Optimization

Category 1: Applications - OR and Management Sciences
Category 2: Applications - Science and Engineering (Statistics)
Category 3: Integer Programming (0-1 Programming)

Entry Submitted: 10/15/2018
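The Kolmogorov-Smirnov discrepancy named in the abstract can be illustrated in one dimension with a minimal sketch. It only evaluates the objective (the largest gap between the empirical distribution function and the GMM distribution function) for given parameters; it is not the MIO formulation, which is not reproduced in this listing. The function names `normal_cdf`, `gmm_cdf`, `ks_distance` and `sample_gmm` are hypothetical and chosen for illustration.

```python
import math
import random


def normal_cdf(t, mean, std):
    """CDF of a univariate Gaussian, written with the error function."""
    return 0.5 * (1.0 + math.erf((t - mean) / (std * math.sqrt(2.0))))


def gmm_cdf(t, weights, means, stds):
    """CDF of a 1-D Gaussian mixture: weighted sum of component CDFs."""
    return sum(w * normal_cdf(t, m, s) for w, m, s in zip(weights, means, stds))


def ks_distance(samples, weights, means, stds):
    """Kolmogorov-Smirnov distance between the empirical CDF and the GMM CDF.

    The empirical CDF jumps from i/n to (i+1)/n at the i-th sorted sample
    (0-based), so the supremum is attained on one side of some jump.
    """
    xs = sorted(samples)
    n = len(xs)
    worst = 0.0
    for i, x in enumerate(xs):
        f = gmm_cdf(x, weights, means, stds)
        worst = max(worst, (i + 1) / n - f, f - i / n)
    return worst


def sample_gmm(n, weights, means, stds, seed=0):
    """Draw n points from the mixture (illustrative helper, fixed seed)."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        k = rng.choices(range(len(weights)), weights=weights)[0]
        points.append(rng.gauss(means[k], stds[k]))
    return points


if __name__ == "__main__":
    weights, means, stds = [0.4, 0.6], [-2.0, 1.5], [1.0, 0.5]
    data = sample_gmm(2000, weights, means, stds)
    # The true parameters should give a smaller discrepancy than shifted means.
    print(ks_distance(data, weights, means, stds))
    print(ks_distance(data, weights, [-1.0, 2.5], stds))
```

Consistent with the abstract's setting, the weights are treated as known and held fixed, so only the means and standard deviations would vary in a fit. A parameter search would call `ks_distance` as its objective; how that search is made exact is the contribution of the paper itself.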