Hierarchical Classification via Orthogonal Transfer
Abstract: We consider multiclass classification problems where the set of labels is organized hierarchically as a category tree. We associate each node in the tree with a classifier and classify the examples recursively from the root to the leaves. We propose a hierarchical Support Vector Machine (SVM) that encourages the classifier at each node of the tree to be different from the classifiers at its ancestors. More specifically, we introduce regularizations that encourage the normal vector of the classifying hyperplane at each node to be as nearly orthogonal as possible to those at its ancestors. We establish conditions under which training such a hierarchical SVM is a convex optimization problem, and develop an efficient dual-averaging method for solving it. We evaluate the method on a number of real-world text categorization tasks and obtain state-of-the-art performance.
Keywords: Hierarchical classification, transfer learning, convex optimization, dual averaging algorithms
Category 1: Applications -- Science and Engineering
Category 2: Convex and Nonsmooth Optimization
Citation: Microsoft Research Technical Report MSR-TR-2011-54. A short version of this paper (without the proofs in the appendix) appears in Proceedings of the 28th International Conference on Machine Learning (ICML), 2011.
Entry Submitted: 05/02/2011
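
The two ideas in the abstract, recursive root-to-leaf classification and a penalty that is small when each node's normal vector is nearly orthogonal to those of its ancestors, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names, the toy category tree, the choice to give the root no classifier, and the specific penalty form (a plain sum of absolute inner products). The abstract does not state the exact regularizer or its weights, so this should not be read as the paper's formulation.

```python
# Illustrative sketch only: names, tree layout, and the penalty form are
# assumptions, not taken from the paper.

def dot(u, v):
    """Inner product of two equal-length vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def ancestors(node, parent):
    """Return the ancestors of `node`, nearest first, from a child -> parent map."""
    out = []
    while node in parent:
        node = parent[node]
        out.append(node)
    return out

def orthogonality_penalty(w, parent):
    """Sum of |w_i . w_j| over each node i and each ancestor j that has a classifier.
    It is zero exactly when every normal vector is orthogonal to those of
    its ancestors."""
    return sum(
        abs(dot(w[i], w[j]))
        for i in w
        for j in ancestors(i, parent)
        if j in w  # the root carries no classifier in this sketch
    )

def classify(x, root, children, w):
    """Descend from the root, choosing at each step the child with the largest score."""
    node = root
    while children.get(node):
        node = max(children[node], key=lambda c: dot(w[c], x))
    return node

# Hypothetical toy category tree and hand-picked normal vectors.
parent = {"sports": "root", "science": "root", "tennis": "sports", "golf": "sports"}
children = {"root": ["sports", "science"], "sports": ["tennis", "golf"]}
w = {
    "sports": [1, 0, 0],
    "science": [0, 1, 0],
    "tennis": [0, 0, 1],
    "golf": [0, 0, -1],
}

label = classify([2, 0, 1], "root", children, w)
penalty = orthogonality_penalty(w, parent)
```

In this toy setup the child classifiers use a direction unused by their parent, so the penalty vanishes; replacing the "tennis" vector with one that overlaps the "sports" vector makes the penalty positive, which is the behavior the regularization is meant to discourage.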