Optimization Online


Strongly Agree or Strongly Disagree?: Rating Features in Support Vector Machines

Emilio Carrizosa (ecarrizosa***at***us.es)
Amaya Nogales-Gomez (amayanogales***at***us.es)
Dolores Romero Morales (dolores.romero-morales***at***sbs.ox.ac.uk)

Abstract: In linear classifiers, such as the Support Vector Machine (SVM), a score is associated with each feature and objects are assigned to classes based on the linear combination of the scores and the values of the features. Inspired by discrete psychometric scales, which measure the extent to which a factor is in agreement with a statement, we propose the Discrete Level Support Vector Machine (DILSVM) where the feature scores can only take on a discrete number of values, defined by the so-called feature rating levels. The DILSVM classifier benefits from interpretability as it can be seen as a collection of Likert scales, one for each feature, where we rate the level of agreement with the positive class. To build the DILSVM classifier, we propose a Mixed Integer Linear Programming approach, as well as a collection of strategies to reduce the building times. Our computational experience shows that the 3-point and the 5-point DILSVM classifiers have comparable accuracy to the SVM with a substantial gain in interpretability and sparsity, thanks to the appropriate choice of the feature rating levels.
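
Illustration: the scoring rule described in the abstract can be sketched as follows. This is only a minimal sketch of how a classifier with discrete feature scores would assign a class, not the authors' Mixed Integer Linear Programming formulation for building one. The 3-point rating levels (-1, 0, 1), the bias term, the tie-breaking rule at a score of zero, and all names below are assumptions made for illustration.

```python
from typing import Sequence

# Hypothetical 3-point rating levels: -1 (disagree with the positive class),
# 0 (neutral, the feature is ignored), +1 (agree with the positive class).
RATING_LEVELS = (-1, 0, 1)


def classify(x: Sequence[float], ratings: Sequence[int], bias: float = 0.0) -> int:
    """Return +1 or -1 from a linear score whose weights are rating levels."""
    if len(x) != len(ratings):
        raise ValueError("x and ratings must have the same length")
    if any(r not in RATING_LEVELS for r in ratings):
        raise ValueError("every rating must be one of RATING_LEVELS")
    # Linear combination of the feature scores and the feature values.
    score = sum(r * v for r, v in zip(ratings, x)) + bias
    # Assumption: a score of exactly zero is assigned to the positive class.
    return 1 if score >= 0 else -1


# One rating per feature; the second feature is rated 0 and drops out.
ratings = [1, 0, -1]
print(classify([2.0, 5.0, 0.5], ratings))  # score = 2.0 - 0.5 = 1.5
print(classify([0.2, 5.0, 1.0], ratings))  # score = 0.2 - 1.0 = -0.8
```

A feature rated 0 contributes nothing to the score, which is one way to read the sparsity gain mentioned in the abstract.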

Keywords: Support Vector Machines, Mixed Integer Linear Programming, Likert scale, interpretability, feature rating level

Category 1: Integer Programming ((Mixed) Integer Linear Programming)

Category 2: Applications -- Science and Engineering (Data-Mining)



Entry Submitted: 10/15/2013
Entry Accepted: 10/16/2013
Entry Last Modified: 06/20/2014
