Issue No.01 - January-February (2011 vol.8)
pp: 80-93
Răzvan Andonie, Central Washington University, Ellensburg and Transylvania University of Braşov, Romania
Levente Fabry-Asztalos, Central Washington University, Ellensburg
Christopher Badi' Abdul-Wahid, Central Washington University, Ellensburg
Sarah Abdul-Wahid, Central Washington University, Ellensburg
Grant I. Barker, Central Washington University, Ellensburg
Lukas C. Magill, Central Washington University, Ellensburg
ABSTRACT
Obtaining satisfactory results with neural networks depends on the availability of large data samples, and the use of small training sets generally reduces performance. Most classical Quantitative Structure-Activity Relationship (QSAR) studies for a specific enzyme system have been performed on small data sets. We focus on the neuro-fuzzy prediction of biological activities of HIV-1 protease inhibitory compounds when inferring from small training sets. We propose two computational intelligence prediction techniques that are suitable for small training sets, at the expense of some computational overhead. Both techniques are based on the FAMR model, a Fuzzy ARTMAP (FAM) incremental learning system used for classification and probability estimation. During the learning phase, each sample pair is assigned a relevance factor proportional to the importance of that pair. The two algorithms proposed in this paper are:
1) The GA-FAMR algorithm, which is new and consists of two stages:
a) In the first stage, we use a genetic algorithm (GA) to optimize the relevances assigned to the training data. This improves the generalization capability of the FAMR.
b) In the second stage, we use the optimized relevances to train the FAMR.
2) The Ordered FAMR, which is derived from a known algorithm. Instead of optimizing relevances, it optimizes the order of data presentation using the algorithm of Dagher et al.
In our experiments, we compare these two algorithms with a previously introduced algorithm not based on the FAM, the FS-GA-FNN. We conclude that when inferring from small training sets, both techniques are efficient in terms of generalization capability and execution time. The computational overhead introduced is compensated by better accuracy. Finally, the proposed techniques are used to predict the biological activities of newly designed potential HIV-1 protease inhibitors.
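The two-stage idea behind GA-FAMR (first search for per-sample relevances with a genetic algorithm, then train with the best relevances found) can be illustrated with a minimal sketch. This is not the authors' implementation: the FAMR is replaced here by a relevance-weighted kernel regressor as a stand-in learner, and every name, parameter value, and the toy data set below are assumptions made for illustration only.

```python
import math
import random


def predict(x, train, relevances, bandwidth=0.5):
    """Stand-in learner (NOT the FAMR): relevance-weighted kernel regression."""
    num = den = 0.0
    for (xi, yi), r in zip(train, relevances):
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))
        w = r * math.exp(-d2 / (2.0 * bandwidth ** 2))
        num += w * yi
        den += w
    return num / den if den > 0.0 else 0.0


def fitness(relevances, train, valid):
    """Negative mean squared error on held-out samples (higher is better)."""
    err = sum((predict(x, train, relevances) - y) ** 2 for x, y in valid)
    return -err / len(valid)


def optimize_relevances(train, valid, pop_size=20, generations=30, seed=0):
    """Stage 1: a simple GA over relevance vectors with entries in [0, 1]."""
    rng = random.Random(seed)
    n = len(train)
    # The uniform vector is included so that, with elitism, the result
    # is never worse on the validation set than using no weighting.
    pop = [[1.0] * n] + [[rng.random() for _ in range(n)]
                         for _ in range(pop_size - 1)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda r: fitness(r, train, valid),
                        reverse=True)
        next_pop = [list(r) for r in ranked[:2]]  # elitism: keep the two best
        while len(next_pop) < pop_size:
            # Tournament selection: best (lowest rank index) of 3 random picks.
            p1 = ranked[min(rng.randrange(pop_size) for _ in range(3))]
            p2 = ranked[min(rng.randrange(pop_size) for _ in range(3))]
            # Uniform crossover followed by clipped Gaussian mutation.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, 0.1)))
                     if rng.random() < 0.2 else g for g in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=lambda r: fitness(r, train, valid))


# Toy data (hypothetical): y = 2x, with one corrupted training sample.
train = [((x,), 2.0 * x) for x in (0.0, 0.5, 1.0, 1.5, 2.0)]
train[2] = ((1.0,), 10.0)
valid = [((x,), 2.0 * x) for x in (0.25, 0.75, 1.25, 1.75)]

# Stage 2: the stand-in learner is "trained" simply by using the
# optimized relevances when predicting.
best = optimize_relevances(train, valid)
print([round(r, 2) for r in best])
print(predict((1.0,), train, best))
```

In this sketch the fitness is measured on a separate validation split, so the relevances are tuned for generalization rather than for fit to the training samples. How the paper itself evaluates candidate relevance vectors is not stated in the abstract.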
INDEX TERMS
Fuzzy neural networks, evolutionary computing and genetic algorithms, computational chemistry, data mining.
CITATION
Răzvan Andonie, Levente Fabry-Asztalos, Christopher Badi' Abdul-Wahid, Sarah Abdul-Wahid, Grant I. Barker, Lukas C. Magill, "Fuzzy ARTMAP Prediction of Biological Activities for Potential HIV-1 Protease Inhibitors Using a Small Molecular Data Set", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.8, no. 1, pp. 80-93, January-February 2011, doi:10.1109/TCBB.2009.50
REFERENCES
[1] R. Andonie and L. Sasu, "Fuzzy ARTMAP with Input Relevances," IEEE Trans. Neural Networks, vol. 17, no. 4, pp. 929-941, July 2006.
[2] I. Dagher, M. Georgiopoulos, G. Heileman, and G. Bebis, "Ordered Fuzzy ARTMAP: A Fuzzy ARTMAP Algorithm with a Fixed Order of Pattern Presentation," Proc. IEEE Int'l Joint Conf. Neural Networks, pp. 1717-1722, May 1998.
[3] I. Dagher, M. Georgiopoulos, G.L. Heileman, and G. Bebis, "An Ordering Algorithm for Pattern Presentation in Fuzzy ARTMAP that Tends to Improve Generalization Performance," IEEE Trans. Neural Networks, vol. 10, no. 4, pp. 768-778, July 1999.
[4] R. Andonie, L. Fabry-Asztalos, S. Abdul-Wahid, C. Collar, and N. Salim, "An Integrated Soft Computing Approach for Predicting Biological Activity of Potential HIV-1 Protease Inhibitors," Proc. IEEE Int'l Joint Conf. Neural Networks (IJCNN '06), pp. 7495-7502, July 2006.
[5] L. Fabry-Asztalos, R. Andonie, C. Collar, S. Abdul-Wahid, and N. Salim, "A Genetic Algorithm Optimized Fuzzy Neural Network Analysis of the Affinity of Inhibitors for HIV-1 Protease," Bioorganic and Medicinal Chemistry, vol. 16, pp. 2903-2911, 2008.
[6] D. Weekes and G.B. Fogel, "Evolutionary Optimization, Backpropagation, and Data Preparation Issues in QSAR Modeling of HIV Inhibition by HEPT Derivatives," BioSystems, vol. 72, pp. 149-158, 2003.
[7] R. Andonie, L. Fabry-Asztalos, C. Collar, S. Abdul-Wahid, and N. Salim, "Neuro-Fuzzy Prediction of Biological Activity and Rule Extraction for HIV-1 Protease Inhibitors," Proc. IEEE Symp. Computational Intelligence in Bioinformatics and Computational Biology (CIBCB '05), pp. 113-120, 2005.
[8] I.V. Tetko, V.Y. Tanchuk, and A.I. Luik, "Evaluation of Potential HIV-1 Reverse Transcriptase Inhibitors by Artificial Neural Networks," Proc. Seventh Ann. IEEE Symp. Computer-Based Medical Systems, pp. 311-316, 1994.
[9] I.V. Tetko and V.Y. Tanchuk, "Application of Associative Neural Networks for Prediction of Lipophilicity in ALOGPS 2.1 Program," J. Chemical Information and Computer Sciences, vol. 42, pp. 1136-1145, 2002.
[10] T. Niwa, "Using General Regression and Probabilistic Neural Networks to Predict Human Intestinal Absorption with Topological Descriptors Derived from Two-Dimensional Chemical Structures," J. Chemical Information and Computer Sciences, vol. 43, pp. 113-119, 2003.
[11] Z.R. Yang and R. Thomson, "Bio-Basis Function Neural Network for Prediction of Protease Cleavage Sites in Proteins," IEEE Trans. Neural Networks, vol. 16, no. 1, pp. 263-274, Jan. 2005.
[12] I.V. Tetko, A.I. Luik, and G.I. Poda, "Application of Neural Networks in Structure-Activity Relationships of a Small Number of Molecules," J. Medicinal Chemistry, vol. 36, pp. 811-814, 1993.
[13] J. Devillers, "Designing Molecules with Specific Properties from Intercommunicating Hybrid Systems," J. Chemical Information and Computer Sciences, vol. 36, pp. 1061-1066, 1996.
[14] T. Niwa, "Prediction of Biological Targets Using Probabilistic Neural Networks and Atom-Type Descriptors," J. Medicinal Chemistry, vol. 47, pp. 2645-2650, 2004.
[15] P. Potocnik, I. Grabec, M. Setinc, and J. Levec, "Hybrid Modeling of Kinetics for Methanol Synthesis," Soft Computing Approaches in Chemistry, H. Cartwright and L.M. Sztandera, eds., Springer-Verlag, 2000.
[16] A.M. Bianucci, A. Micheli, A. Sperduti, and A. Starita, "Application of Cascade Correlation Networks for Structures to Chemistry," Applied Intelligence, vol. 12, pp. 117-147, 2000.
[17] X.J. Yao, A. Panaye, J.P. Doucet, R.S. Zhang, H.F. Chen, M.C. Liu, Z.D. Hu, and B.T. Fan, "Comparative Study of QSAR/QSPR Correlations Using Support Vector Machines, Radial Basis Function Neural Networks, and Multiple Linear Regression," J. Chemical Information and Computer Sciences, vol. 44, pp. 1257-1266, 2004.
[18] S. Draghici and R.B. Potter, "Predicting HIV Drug Resistance with Neural Networks," Bioinformatics, vol. 19, pp. 98-107, 2003.
[19] J. Paetz, "Evolutionary Optimization of Interval Rules for Drug Design," Proc. 2004 IEEE Symp. Computational Intelligence in Bioinformatics and Computational Biology (CIBCB '04), pp. 238-243, 2004.
[20] D. Yaffe, Y. Cohen, G. Espinosa, A. Arenas, and F. Giralt, "Fuzzy ARTMAP and Back-Propagation Neural Networks Based Quantitative Structure-Property Relationships (QSPRs) for Octanol-Water Partition Coefficient of Organic Compounds," J. Chemical Information and Computer Sciences, vol. 42, pp. 162-183, 2002.
[21] G. Espinosa, D. Yaffe, A. Arenas, Y. Cohen, and F. Giralt, "A Fuzzy ARTMAP-Based Quantitative Structure-Property Relationship (QSPR) for Predicting Physical Properties of Organic Compounds," Industrial & Eng. Chemistry Research, vol. 40, pp. 2757-2766, 2001.
[22] G. Espinosa, A. Arenas, and F. Giralt, "An Integrated SOM-Fuzzy ARTMAP Neural System for the Evaluation of Toxicity," J. Chemical Information and Computer Sciences, vol. 42, pp. 343-359, 2002.
[23] A.M. Bianucci, A. Micheli, A. Sperduti, and A. Starita, "A Novel Approach to QSPR/QSAR Based on Neural Networks for Structures," Soft Computing Approaches in Chemistry, H. Cartwright and L.M. Sztandera, eds., Springer-Verlag, 2000.
[24] A. Micheli, F. Portera, and A. Sperduti, "A Preliminary Experimental Comparison of Recursive Neural Networks and Tree Kernel Methods on Regression Tasks for Tree Structured Domains," Neurocomputing, vol. 64, pp. 73-92, 2005.
[25] G.A. Carpenter, S. Grossberg, N. Markuzon, J.H. Reynolds, and D.B. Rosen, "Fuzzy ARTMAP: A Neural Network Architecture for Incremental Supervised Learning of Analog Multidimensional Maps," IEEE Trans. Neural Networks, vol. 3, no. 5, pp. 698-713, Sept. 1992.
[26] S. Verzi, G. Heileman, M. Georgiopoulos, and M.J. Healy, "Boosted ARTMAP," Proc. IEEE World Congress Computational Intelligence (WCCI '98), pp. 396-400, 1998.
[27] E. Granger, P. Henniges, L.S. Oliveira, and R. Sabourin, "Particle Swarm Optimization of Fuzzy ARTMAP Parameters," Proc. IEEE Int'l Joint Conf. Neural Networks (IJCNN '06), pp. 4062-4069, July 2006.
[28] A. Al Daraiseh, M. Georgiopoulos, G. Anagnostopoulos, A.S. Wu, and M. Mollaghasemi, "GFAM: A Genetic Algorithm Optimization of Fuzzy ARTMAP," Proc. IEEE Int'l Joint Conf. Neural Networks (IJCNN '06), pp. 1391-1398, July 2006.
[29] S. Tan, M. Rao, and C.P. Lim, "A Hybrid Neural Network Classifier Combining Ordered Fuzzy ARTMAP and the Dynamic Decay Adjustment Algorithm," Soft Computing, vol. 12, pp. 765-775, 2008.
[30] A. Koufakou, M. Georgiopoulos, G. Anagnostopoulos, and T. Kasparis, "Cross-Validation in Fuzzy ARTMAP for Large Databases," Neural Networks, vol. 14, pp. 1279-1291, 2001.
[31] P. Henniges, E. Granger, and R. Sabourin, "Factors of Overtraining with Fuzzy ARTMAP Neural Networks," Proc. IEEE Int'l Joint Conf. Neural Networks (IJCNN '05), pp. 1075-1080, Aug. 2005.
[32] R. Andonie, L. Fabry-Asztalos, L. Magill, and S. Abdul-Wahid, "A New Fuzzy ARTMAP Approach for Predicting Biological Activity of Potential HIV-1 Protease Inhibitors," Proc. IEEE Int'l Conf. Bioinformatics and Biomedicine (BIBM '07), pp. 56-61, 2007.
[33] V. Vapnik, Statistical Learning Theory. Wiley, 2000.
[34] J.-L. Yuan and T. Fine, "Neural-Network Design for Small Training Sets of High Dimension," IEEE Trans. Neural Networks, vol. 9, no. 2, pp. 266-280, Mar. 1998.
[35] J.-L. Yuan, "Bootstrapping Nonparametric Feature Selection Algorithms for Mining Small Data Sets," Proc. Int'l Joint Conf. Neural Networks (IJCNN), pp. 2526-2529, 1999.
[36] R. Mao, H. Zhu, L. Zhang, and A. Chen, "A New Method to Assist Small Data Set Neural Network Learning," Proc. Sixth Int'l Conf. Intelligent Systems Design and Applications (ISDA '06), pp. 17-22, 2006.
[37] D.-C. Li, C.-S. Wu, T.-I. Tsai, and Y.-S. Lin, "Using Mega-Trend-Diffusion and Artificial Samples in Small Data Set Learning for Early Flexible Manufacturing System Scheduling Knowledge," Computers and Operations Research, vol. 34, pp. 966-982, 2007.
[38] D.-C. Li, C.-W. Yeh, T.I. Tsai, Y.-H. Fang, and S. Hu, "Acquiring Knowledge with Limited Experience," Expert Systems, vol. 24, pp. 162-170, 2007.
[39] D. Hecht and G. Fogel, "High-Throughput Ligand Screening via Preclustering and Evolved Neural Networks," IEEE/ACM Trans. Computational Biology and Bioinformatics, vol. 4, no. 3, pp. 476-484, July-Sept. 2007.
[40] S. Mars Cheung Johnson, D. Hecht, and G.B. Fogel, "Quantitative Structure-Property Relationships for Drug Solubility Prediction Using Evolved Neural Networks," Proc. IEEE World Congress on Computational Intelligence, pp. 688-693, 2008.
[41] A.K. Debnath, "Comparative Molecular Field Analysis (CoMFA) of a Series of Symmetrical Bis-Benzamide Cyclic Urea Derivatives as HIV-1 Protease Inhibitors," J. Chemical Information and Computer Sciences, vol. 38, pp. 761-767, 1998.
[42] A.C. Nair, P. Jayatilleke, X. Wang, S. Miertus, and W.J. Welsh, "Computational Studies on Tetrahydropyrimidine-2-One HIV-1 Protease Inhibitors: Improving Three-Dimensional Quantitative Structure-Activity Relationship Comparative Molecular Field Analysis Models by Inclusion of Calculated Inhibitor- and Receptor-Based Properties," J. Medicinal Chemistry, vol. 45, pp. 973-983, 2002.
[43] Y. Peng, S. Keenan, Q. Zhang, V. Kholodovych, and W. Welsh, "3D-QSAR Comparative Molecular Field Analysis on Opioid Receptor Antagonists: Pooling Data from Different Studies," J. Medicinal Chemistry, vol. 48, pp. 1620-1629, 2005.
[44] A. Wlodawer, "Structure-Based Inhibitors of HIV-1 Protease," Ann. Rev. Biochemistry, vol. 62, pp. 543-585, 1993.
[45] A. Wlodawer and J. Vondrasek, "Inhibitors of HIV-1 Protease: A Major Success of Structure-Assisted Drug Design," Ann. Rev. Biophysics and Biomolecular Structure, vol. 27, pp. 249-284, 1998.
[46] D. Leung, G. Abbenante, and D.P. Fairlie, "Protease Inhibitors: Current Status and Future Prospects," J. Medicinal Chemistry, vol. 43, pp. 305-341, 2000.
[47] D.H. Rich, "Comprehensive Medicinal Chemistry," Chemical Rev., vol. 2, pp. 391-441, 1990.
[48] F.E. Boyer, J.V. Vara Prasad, J.M. Domagala, E.L. Ellsworth, C. Gajda, S.E. Hagen, L.J. Markoski, B.D. Tait, E.A. Lunney, A. Pavlovsky, D. Ferguson, N. Graham, T. Holler, D. Hupe, C. Nouhan, P.J. Tummino, A. Urumov, E. Zeikus, G. Zeikus, S.J. Gracheck, J.M. Saunders, S. VanderRoest, J. Brodfuehrer, K. Iyer, M. Sinz, and S.V. Gulnik, "5,6-Dihydropyran-2-Ones Possessing Various Sulfonyl Functionalities: Potent Nonpeptidic Inhibitors of HIV Protease," J. Medicinal Chemistry, vol. 43, pp. 843-858, 2000.
[49] M.P. Glenn, L.K. Pattenden, R.C. Reid, D.P. Tyssen, J.D. Tyndall, C.J. Birch, and D.P. Fairlie, "Beta-Strand Mimicking Macrocyclic Amino Acids: Templates for Protease Inhibitors with Antiviral Activity," J. Medicinal Chemistry, vol. 45, pp. 371-381, 2002.
[50] D. Scholz, A. Billich, B. Charpiot, P. Ettmayer, P. Lehr, B. Rosenwirth, E. Schreiner, and H. Gstach, "Inhibitors of HIV-1 Proteinase Containing 2-Heterosubstituted 4-Amino-3-Hydroxy-5-Phenylpentanoic Acid: Synthesis, Enzyme Inhibition, and Antiviral Activity," J. Medicinal Chemistry, vol. 37, pp. 3079-3089, 1994.
[51] S.E. Hagen, J.V. Prasad, F.E. Boyer, J.M. Domagala, E.L. Ellsworth, C. Gajda, H.W. Hamilton, L.J. Markoski, B.A. Steinbaugh, B.D. Tait, E.A. Lunney, P.J. Tummino, D. Ferguson, D. Hupe, C. Nouhan, S.J. Gracheck, J.M. Saunders, and S. VanderRoest, "Synthesis of 5,6-Dihydro-4-Hydroxy-2-Pyrones as HIV-1 Protease Inhibitors: The Profound Effect of Polarity on Antiviral Activity," J. Medicinal Chemistry, vol. 40, pp. 3707-3711, 1997.
[52] C.J. Collar, "Molecular Modeling of Four Aspartic Protease Enzymes and Their Inhibitors," master's thesis, Central Washington Univ., June 2006.
[53] S.E. Hagen, J. Domagala, C. Gajda, M. Lovdahl, B.D. Tait, E. Wise, T. Holler, D. Hupe, C. Nouhan, A. Urumov, G. Zeikus, E. Zeikus, E.A. Lunney, A. Pavlovsky, S.J. Gracheck, J. Saunders, S. VanderRoest, and J. Brodfuehrer, "4-Hydroxy-5,6-Dihydropyrones as Inhibitors of HIV Protease: The Effect of Heterocyclic Substituents at c-6 on Antiviral Potency and Pharmacokinetic Parameters," J. Medicinal Chemistry, vol. 44, pp. 2319-2332, 2001.
[54] B.D. Tait, S. Hagen, J. Domagala, E.L. Ellsworth, C. Gajda, H.W. Hamilton, J.V.N.V. Prasad, D. Ferguson, N. Graham, D. Hupe, C. Nouhan, P.J. Tummino, C. Humblet, E.A. Lunney, A. Pavlovsky, J. Rubin, S.J. Gracheck, E.T. Baldwin, T.N. Bhat, J.W. Erickson, S.V. Gulnik, and B. Liu, "4-Hydroxy-5,6-Dihydropyrones. 2. Potent Non-Peptide Inhibitors of HIV Protease," J. Medicinal Chemistry, vol. 40, pp. 3781-3792, 1997.
[55] R.J. Hyndman and A.B. Koehler, "Another Look at Measures of Forecast Accuracy," Technical Report 13/05, Dept. of Econometrics and Business Statistics, Monash Univ., May 2005.
[56] S. Makridakis, "Accuracy Measures: Theoretical and Practical Concerns," Int'l J. Forecasting, vol. 9, pp. 527-529, 1993.
[57] T.T. Tanimoto, "An Elementary Mathematical Theory of Classification and Prediction," technical report, IBM, 1958.
[58] R. Guha, M.T. Howard, G.R. Hutchison, P. Murray-Rust, H. Rzepa, C. Steinbeck, J.K. Wegner, and E.L. Willighagen, "The Blue Obelisk-Interoperability in Chemical Informatics," J. Chemical Information and Modeling, vol. 46, pp. 991-998, 2006.
[59] M.-T. Vakil-Baghmisheh and N. Pavešić, "A Fast Simplified Fuzzy ARTMAP Network," Neural Processing Letters, vol. 17, no. 3, pp. 273-316, 2003.
[60] C.P. Lim and R.F. Harrison, "ART-Based Autonomous Learning Systems: Part I—Architectures and Algorithms," Innovations in ART Neural Networks, L.C. Jain, B. Lazzerini, and U. Halici, eds. Springer, 2000.
[61] J. Tou and R. Gonzalez, Pattern Recognition Principles. Addison-Wesley, 1976.
[62] D.M. Hawkins, "The Problem of Overfitting," J. Chemical Information and Computer Sciences, vol. 44, pp. 1-12, 2004.
[63] T.G. Dietterich, "Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms," Neural Computation, vol. 10, pp. 1895-1923, 1998.