Normalized Feature Vectors: A Novel Alignment-Free Sequence Comparison Method Based on the Numbers of Adjacent Amino Acids
March-April 2013 (vol. 10 no. 2)
pp. 457-467
De-Shuang Huang, Tongji University, Shanghai
Hong-Jie Yu, Anhui Science and Technology University, Fengyang
Based on all 400 kinds of adjacent amino acids (AAA), we map each protein primary sequence of length $L$ into a $400 \times (L-1)$ matrix $M$. From this matrix we derive a normalized 400-tuple mathematical descriptor $D$, extracted from the primary protein sequence via singular value decomposition (SVD) of $M$. The resulting 400-D normalized feature vectors (NFVs) facilitate quantitative analysis of protein sequences. Using this normalized representation, we analyze sequence similarity on two data sets: 1) ND5 sequences from nine species and 2) transferrin sequences of 24 vertebrates. We also compare the results of this study with those of related works. The two experiments illustrate that the proposed NFV-AAA approach performs well in sequence similarity analysis.
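The pipeline described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the entries of $M$ are defined or how $D$ is read off the SVD. The sketch assumes that $M$ is a 0/1 indicator matrix (column $i$ marks which of the 400 ordered pairs occupies positions $i$ and $i+1$), that $D$ is the vector of singular values of $M$ zero-padded to 400 entries and scaled to unit length, and that two sequences are compared by Euclidean distance. The names `aaa_matrix`, `nfv`, and `distance` are hypothetical.

```python
import itertools
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Index for the 400 ordered pairs of adjacent amino acids: "AA", "AC", ..., "YY".
PAIR_INDEX = {
    a + b: k
    for k, (a, b) in enumerate(itertools.product(AMINO_ACIDS, repeat=2))
}


def aaa_matrix(seq):
    """Build a 400 x (L-1) matrix for a sequence of length L.

    Assumed encoding (not confirmed by the abstract): column i holds a
    single 1 in the row of the pair seq[i:i+2]. Letters outside the 20
    standard amino acids raise KeyError.
    """
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")
    m = np.zeros((400, len(seq) - 1))
    for i in range(len(seq) - 1):
        m[PAIR_INDEX[seq[i:i + 2]], i] = 1.0
    return m


def nfv(seq):
    """Return a unit-length 400-D vector derived from the SVD of the matrix.

    Assumption: the descriptor is the singular values of the matrix,
    zero-padded to 400 entries and divided by their Euclidean norm.
    """
    s = np.linalg.svd(aaa_matrix(seq), compute_uv=False)
    d = np.zeros(400)
    d[:len(s)] = s  # len(s) == min(400, L-1)
    return d / np.linalg.norm(d)


def distance(seq_a, seq_b):
    """Euclidean distance between two descriptors (assumed similarity measure)."""
    return float(np.linalg.norm(nfv(seq_a) - nfv(seq_b)))
```

Under this assumed encoding, sequences of different lengths all map to vectors of the same dimension, which is what allows them to be compared without alignment.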
Index Terms:
Proteins, Amino acids, Vectors, Feature extraction, Bioinformatics, Educational institutions, alignment free, similarity analysis, Adjacent amino acids, normalized feature vector, singular value decomposition (SVD)
Citation:
De-Shuang Huang, Hong-Jie Yu, "Normalized Feature Vectors: A Novel Alignment-Free Sequence Comparison Method Based on the Numbers of Adjacent Amino Acids," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 10, no. 2, pp. 457-467, March-April 2013, doi:10.1109/TCBB.2013.10