Sixth IEEE International Conference on Data Mining - Workshops (ICDMW'06)
Sparse Logistic Classifiers for Interpretable Protein Homology Detection
Hong Kong, China
December 18-December 22
ISBN: 0-7695-2702-7
Pai-Hsi Huang, Rutgers University
Vladimir Pavlovic, Rutgers University
Computational classification of proteins using methods such as string kernels and Fisher-SVM has demonstrated great success. However, the resulting models do not offer an immediate interpretation of the underlying biological mechanisms. In particular, some recent studies have postulated that a small subset of positions and residues in protein sequences may be sufficient to discriminate among different protein classes. In this work, we propose a hybrid setting for the classification task. A generative model is trained as a feature extractor, followed by a sparse classifier in the extracted feature space that determines the membership of the sequence while discovering the features relevant for classification. The set of sparse, biologically motivated features, together with the discriminative method, offers the desired biological interpretability. We apply the proposed method to a widely used dataset and show that the performance of our models is comparable to that of the state-of-the-art methods. The resulting models use fewer than 10% of the original features. At the same time, the sets of critical features discovered by the model appear to be consistent with confirmed biological findings.
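The sparse classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the features here are synthetic Gaussian values standing in for the output of a generative feature extractor, the optimizer (proximal gradient descent with soft-thresholding) is one common way to fit an L1-regularized logistic model, and all function names and parameter values are hypothetical choices made for illustration.

```python
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 norm: shrink value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_sparse_logistic(X, y, l1_penalty=0.1, step=0.1, iterations=300):
    """Fit L1-regularized logistic regression by proximal gradient descent.

    Returns the weight vector and the (unpenalized) bias term.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(iterations):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= step * grad_b / n
        # Gradient step on the logistic loss, then soft-thresholding,
        # which sets weakly supported weights to exactly zero.
        w = [
            soft_threshold(wj - step * gj / n, step * l1_penalty)
            for wj, gj in zip(w, grad_w)
        ]
    return w, b


def make_toy_features(n=200, d=30, informative=(0, 1, 2), seed=0):
    """Synthetic stand-in for extracted features (an assumption, not real data).

    Only the columns listed in `informative` depend on the class label.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for j in informative:
            row[j] += 2.0 if label == 1 else -2.0
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_features()
w, b = fit_sparse_logistic(X, y)
# Indices of features that keep a nonzero weight after fitting.
selected = [j for j, wj in enumerate(w) if wj != 0.0]
print("selected feature indices:", selected)
```

In this toy setting the nonzero entries of `w` play the role of the critical features mentioned above: the L1 penalty drives the weights of uninformative features to zero, so the surviving indices can be inspected directly.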
Citation:
Pai-Hsi Huang, Vladimir Pavlovic, "Sparse Logistic Classifiers for Interpretable Protein Homology Detection," icdmw, pp.99-103, Sixth IEEE International Conference on Data Mining - Workshops (ICDMW'06), 2006