Dimensionality Reduction for Classification - Comparison of Techniques and Dimension Choice
Frank Plastria (Frank.Plastria

Abstract: Dimensionality reduction is an important issue in classification problems, since high-dimensional data sets are analyzed with sophisticated classification algorithms whose running times may be dramatically affected by the dimensionality of the data. We investigate the effects of dimensionality reduction, using different techniques and different target dimensions, as pre-processing of two-class data sets for two classification algorithms. Besides reducing the dimensionality with principal components and linear discriminants, we also introduce four new techniques. After this dimensionality reduction, two algorithms are applied. The first takes advantage of the reduced dimensionality itself, while the second directly exploits the ranking of the dimensions. We show on six two-class data sets with numerical attributes that, when this pre-processing is carried out effectively, these algorithms generate classifiers that can rival industry standards. The choice of the dimensionality has a significant impact. On the one hand, the results show that it is worthwhile not to choose a fixed dimensionality without considering the data. On the other hand, and more importantly, we observe that common approaches based on the residual variance, which determine the dimensionality from the data alone and independently of the classification algorithm, may be bad estimators if the goal is to maximize classification power.

Keywords: Dimensionality Reduction, Dimension Choice, Classification, Principal Components, PCA, Linear Discriminants, Fisher, LDA, Principal Separation Components, PSC, Mean Components, PMC, LMD, PMSC, Optimal Distance Separating Hyperplane, ODSH, Eigenvalue-based Classification Tree, EVCT

Category 1: Applications -- Science and Engineering (Data-Mining)

Citation: To be published in Lecture Notes in Artificial Intelligence, Springer.

Entry Submitted: 04/23/2008