Optimization Online





 

Optimal expected-distance separating halfspace

Emilio Carrizosa (ecarrizosa***at***us.es)
Frank Plastria (Frank.Plastria***at***vub.ac.be)

Abstract: One recently proposed criterion for separating two datasets in discriminant analysis is to use a hyperplane that minimises the sum of distances to it from all the misclassified data points. Here all distances are measured by some fixed norm, and a point is misclassified when it lies on the wrong side of the hyperplane, that is, in the wrong halfspace. In this paper we study the problem of determining such an optimal halfspace when points are distributed according to an arbitrary random vector $X$ in $\mathbb{R}^d$. In the unconstrained case in dimension $d$, we prove that any optimal separating halfspace always balances the misclassified points. Moreover, under polyhedrality assumptions on the support of $X$, there always exists an optimal separating halfspace passing through $d$ affinely independent points. It follows that, when the support of $X$ consists of $n$ points, the problem is polynomially solvable in fixed dimension by an algorithm of complexity $O(n^{d+1})$. All these results are strengthened in the one-dimensional case, yielding an algorithm with complexity linear in the cardinality of the support of $X$. If a different norm is used for each dataset to measure distances to the hyperplane, or if all distances are measured by a fixed gauge, the balancing property still holds, and we show that, under polyhedrality assumptions on the support of $X$, there always exists an optimal separating halfspace passing through $d-1$ affinely independent data points. These results extend in a natural way when we allow constraints modeling that certain points are forced to be correctly classified.
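As an illustration only, the enumeration idea behind the $O(n^{d+1})$ bound can be sketched for $d = 1$, where a hyperplane is a threshold $t$ and every norm is a multiple of the absolute value. The sketch below is not the authors' algorithm (in particular, it is not the linear-time method mentioned for the one-dimensional case). It assumes a finite support given as hypothetical `(x, label, prob)` triples with labels in {+1, -1}, and it relies on the abstract's claim that some optimal halfspace passes through $d$ support points, so only thresholds at support points are tried. The function names are invented for this example.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, int, float]  # (position x, class label +1 or -1, probability)


def expected_misclassification_distance(points: List[Point], t: float, sign: int) -> float:
    """Expected distance to the threshold t over misclassified points.

    Class +1 is assigned the halfspace {x : sign * (x - t) >= 0};
    class -1 is assigned its complement. Points lying exactly on t
    contribute zero, whichever class they belong to.
    """
    cost = 0.0
    for x, label, prob in points:
        side = sign * (x - t)          # positive means the +1 side
        if label * side < 0:           # strictly on the wrong side
            cost += prob * abs(x - t)  # probability-weighted distance
    return cost


def best_threshold(points: List[Point]) -> Optional[Tuple[float, float, int]]:
    """Brute force: try every support point as threshold, in both orientations.

    There are n candidate thresholds, each evaluated in O(n) time,
    which gives O(n^2) overall, i.e. O(n^(d+1)) with d = 1.
    Returns (cost, threshold, sign), or None for an empty support.
    """
    best: Optional[Tuple[float, float, int]] = None
    for x, _, _ in points:
        for sign in (1, -1):
            cost = expected_misclassification_distance(points, x, sign)
            if best is None or cost < best[0]:
                best = (cost, x, sign)
    return best
```

For example, with the support `[(0.0, -1, 0.25), (1.0, -1, 0.25), (2.0, 1, 0.25), (3.0, 1, 0.25)]` the two classes are separable, so `best_threshold` returns a candidate with cost zero.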

Keywords: norm-distance to hyperplane, separating halfspace, discriminant analysis

Category 1: Applications -- Science and Engineering (Data-Mining)

Category 2: Global Optimization (Applications)

Category 3: Applications -- Science and Engineering (Statistics)

Citation: Working paper MOSI/7, March 2004, 26 pp., Vrije Universiteit Brussel.


Entry Submitted: 10/05/2004
Entry Accepted: 10/05/2004
Entry Last Modified: 10/05/2004
