Issue No. 05 - September/October (2011 vol. 8)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2011.31
William W.L. Wong, University of Toronto, Toronto
Forbes J. Burkowski, University of Waterloo, Waterloo
Quantitative structure-activity relationships (QSARs) correlate the biological activities of chemical compounds with their physicochemical descriptors. By modeling the observed relationship between molecular descriptors and their corresponding biological activities, we may predict the behavior of other molecules with similar descriptors. In QSAR studies, it has been shown that the quality of the prediction model depends strongly on the features selected from the molecular descriptors. Methods capable of automatically selecting relevant features are therefore very desirable. In this paper, we present a new feature selection algorithm for QSAR studies based on kernel alignment, which has been used as a measure of similarity between two kernel functions. In our algorithm, we deploy kernel alignment as an evaluation tool, using recursive feature elimination to compute a molecular descriptor containing the most important features needed for a classification application. Empirical results show that the algorithm works well for computing descriptors for various applications involving different QSAR data sets. The prediction accuracies are substantially increased and are comparable to those from earlier studies.
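The abstract does not specify the kernel, the scoring rule, or the stopping criterion, so the following is only a minimal sketch of how kernel alignment could drive recursive feature elimination, not the authors' implementation. It assumes a linear kernel, class labels coded as +1/-1, the ideal target kernel y yᵀ, and a rule that drops, at each step, the feature whose removal leaves the highest alignment. The names linear_kernel, alignment, and select_features are hypothetical.

```python
import numpy as np

def linear_kernel(X):
    # Gram matrix of a linear kernel (an assumption; any kernel could be passed in).
    return X @ X.T

def alignment(K1, K2):
    # Kernel alignment: Frobenius inner product of the two Gram matrices,
    # normalized by the product of their Frobenius norms.
    num = np.sum(K1 * K2)
    den = np.sqrt(np.sum(K1 * K1) * np.sum(K2 * K2))
    return float(num / den)

def select_features(X, y, n_keep, kernel=linear_kernel):
    # Backward elimination scored by alignment with the target kernel y y^T.
    # X: (n_samples, n_features) descriptor matrix; y: labels in {+1, -1}.
    # n_keep must be at least 1.
    target = np.outer(y, y)
    active = list(range(X.shape[1]))
    while len(active) > n_keep:
        # Score each candidate removal by the alignment of the remaining features.
        scores = [
            alignment(kernel(X[:, [j for j in active if j != f]]), target)
            for f in active
        ]
        # Drop the feature whose removal leaves the best-aligned kernel.
        active.pop(int(np.argmax(scores)))
    return active
```

Calling select_features(X, y, n_keep=10) on a descriptor matrix X would return the indices of the ten retained features. Each elimination step builds one kernel per remaining feature, so the cost grows quadratically with the number of features; a practical implementation would likely remove features in batches or update the kernel incrementally.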
Feature selection, kernel alignment, quantitative structure-activity relationship (QSAR).
F. J. Burkowski and W. W. Wong, "Using Kernel Alignment to Select Features of Molecular Descriptors in a QSAR Study," in IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 8, no. 5, pp. 1373-1384, 2011.