Fourth IEEE Symposium on Bioinformatics and Bioengineering (BIBE'04)
A Fast Protein Structure Retrieval System Using Image-Based Distance Matrices and Multidimensional Index
Taichung, Taiwan, ROC
May 19-May 21
ISBN: 0-7695-2173-8
Indexing protein structures has been shown to provide a scalable solution for structure-to-structure comparisons in large protein structure retrieval systems. To conduct similarity searches against a database of 46,075 polypeptide chains with real-time responses, two critical issues must be addressed: information extraction and suitable indexing. In this paper, we apply computer vision techniques to extract the predominant information encoded in each 2D distance matrix, which is generated from the 3D coordinates of a protein chain. Distance matrices can represent specific protein structural topologies, and similar proteins generate similar matrices. Once meaningful features are extracted from the distance images, an advanced indexing structure, the Entropy Balanced Statistical (EBS) k-d tree, can be used to index the multidimensional data. With a limited amount of training data from domain experts, namely the structural classification of a subset of the available protein chains, we apply various pattern recognition techniques to determine clusters of proteins in the multidimensional feature space. Our system retrieves ranked search results from the protein database in seconds, with a reasonably high degree of precision.
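The distance-matrix step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which atoms are used or how distances are mapped to pixel intensities, so the one-point-per-residue coordinates, the `max_distance` clipping value, and the function names `distance_matrix` and `to_grayscale` are all assumptions made for illustration.

```python
import math


def distance_matrix(coords):
    """Return the symmetric matrix of pairwise Euclidean distances."""
    n = len(coords)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


def to_grayscale(matrix, max_distance):
    """Map distances to 0-255 intensities, clipping at max_distance.

    The linear scaling is an assumption; the abstract does not specify one.
    """
    return [
        [round(255 * min(d, max_distance) / max_distance) for d in row]
        for row in matrix
    ]


# Hypothetical coordinates (one 3D point per residue), not real protein data.
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
dm = distance_matrix(coords)
image = to_grayscale(dm, max_distance=13.0)
```

Because the matrix depends only on pairwise distances, it is unchanged when the chain is rotated or translated, which is what makes it a convenient 2D image representation of a 3D structure.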
Citation:
Pin-Hao Chi, Grant Scott, Chi-Ren Shyu, "A Fast Protein Structure Retrieval System Using Image-Based Distance Matrices and Multidimensional Index," bibe, pp.522, Fourth IEEE Symposium on Bioinformatics and Bioengineering (BIBE'04), 2004