Optimization Online


Exact solution of the donor-limited nearest neighbor hot deck imputation problem

Jan Pablo Burgard(burgardj***at***uni-trier.de)
Sven de Vries(devries***at***uni-trier.de)
Ulf Friedrich(ulf.friedrich***at***tum.de)
Dennis Kreber(kreberd***at***uni-trier.de)

Abstract: Data quality in population surveys suffers from missing responses. We use combinatorial optimization to create a complete and coherent data set. The methods are based on the widespread nearest neighbor hot deck imputation method that replaces the missing values with observed values from a close unit, the so-called donor. As a repeated use of donors may lead to distortions in the statistics, an additional constraint on the maximum number of donor re-uses is introduced. It is shown how this problem can be modeled as a maximum weighted b-matching problem. Based on this theoretical analysis, we implement a cost scaling, network simplex and successive shortest path algorithm for computing a globally optimal solution. These outperform the available and presently used implementations in terms of solution quality and running time. Statistical imputation is presented as a novel application of Operations Research, the available methodology is improved by a concise theoretical treatment, and fast implementations are provided.

Keywords: Survey Data, Combinatorial Optimization, b-Matching, Missing Data, Nearest Neighbor Hot Deck Imputation

Category 1: Combinatorial Optimization (Graphs and Matroids )

Category 2: Applications -- Science and Engineering (Statistics )

Category 3: Network Optimization


Download: [PDF]

Entry Submitted: 08/20/2019
Entry Accepted: 08/20/2019
Entry Last Modified: 08/20/2019

Modify/Update this entry

  Visitors Authors More about us Links
  Subscribe, Unsubscribe
Digest Archive
Search, Browse the Repository


Coordinator's Board
Classification Scheme
Give us feedback
Optimization Journals, Sites, Societies
Mathematical Optimization Society