Rapid advances in bionanotechnology have generated growing interest in identifying peptides that bind to inorganic materials and in classifying them by their affinity for those materials. However, inorganic material binding sequence data have distinct characteristics that limit the performance of many widely used classification methods when applied to this problem. In this paper, we propose a novel framework to predict the affinity classes of peptide sequences with respect to an associated inorganic material. We first generate a large set of simulated peptide sequences based on an amino acid transition matrix tailored to the specific inorganic material. We then estimate the probability that each test sequence belongs to a specific affinity class by minimizing an objective function, which is done through iterative propagation of probability estimates among sequences and sequence clusters. Results of computational experiments on two real inorganic material binding sequence datasets show that the proposed framework is highly effective for identifying the affinity classes of inorganic material binding sequences. Moreover, experiments on the SCOP (Structural Classification of Proteins) dataset show that the proposed framework is general and can be applied to traditional protein sequences.
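As an illustration of the first step, generating simulated peptide sequences from an amino acid transition matrix, the following is a minimal sketch only. The abstract does not say how the matrix is constructed or sampled, so the sketch assumes a first-order Markov chain in which each residue depends only on the previous one, with a uniformly drawn first residue. The function names and the uniform placeholder matrix are hypothetical and are not taken from the paper.

```python
import random

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def uniform_transition_matrix():
    """Placeholder matrix (assumption): every residue is equally likely
    to follow any other. A material-specific matrix would replace this."""
    p = 1.0 / len(AMINO_ACIDS)
    return {a: {b: p for b in AMINO_ACIDS} for a in AMINO_ACIDS}


def simulate_peptide(transition, length, rng):
    """Sample one peptide as a first-order Markov chain over residues."""
    if length <= 0:
        return ""
    # Assumption: the first residue is drawn uniformly at random.
    seq = [rng.choice(AMINO_ACIDS)]
    while len(seq) < length:
        row = transition[seq[-1]]  # next-residue probabilities
        nxt = rng.choices(list(row), weights=list(row.values()), k=1)[0]
        seq.append(nxt)
    return "".join(seq)


def simulate_peptides(transition, n, length, seed=0):
    """Generate n simulated peptides of a fixed length, reproducibly."""
    rng = random.Random(seed)
    return [simulate_peptide(transition, length, rng) for _ in range(n)]


peptides = simulate_peptides(uniform_transition_matrix(), n=5, length=12)
```

In this sketch, a matrix tailored to a given material would simply be a different `transition` dictionary whose rows each sum to 1; how such a matrix is estimated is left to the paper itself.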
Aidong Zhang, "Identifying Affinity Classes of Inorganic Materials Binding Sequences via a Graph-based Model", IEEE/ACM Transactions on Computational Biology and Bioinformatics, no. 1, pp. 1, PrePrints, doi:10.1109/TCBB.2014.2321158