1545-5963/12/$31.00 © 2012 IEEE
Published by the IEEE Computer Society
Guest Editors' Introduction to the Special Section on Bioinformatics Research and Applications
This special section includes a selection of papers presented at the Seventh International Symposium on Bioinformatics Research and Applications (ISBRA), which was held at Central South University, Changsha, China, on 27-29 May 2011. The ISBRA symposium provides a forum for the exchange of ideas and results among researchers, developers, and practitioners working on all aspects of bioinformatics and computational biology and their applications. In 2011, 91 full papers were submitted in response to the call for papers, of which 36 appeared in the ISBRA proceedings, published as volume 6674 of Springer-Verlag's Lecture Notes in Bioinformatics series.
A small number of authors were invited to submit extended versions of their symposium papers to this special section. Following a rigorous review process, seven papers were selected for publication. The first three selected papers are devoted to classic problems in phylogenetics, the fourth paper applies phylogenetics to refining regulatory networks, and the last three papers address different aspects of protein-protein interactions.
The Robinson-Foulds (RF) supertree problem asks for a supertree that is most similar to the input trees, i.e., consistent with the maximum number of splits in the input trees. Although finding RF supertrees is NP-hard for both rooted and unrooted data, effective local search heuristics exist when the input and output trees are rooted. "Fast Local Search for Unrooted Robinson-Foulds Supertrees" by Ruchi Chaudhary, J. Gordon Burleigh, and David Fernández-Baca proposes new heuristics for the unrooted case that improve the quality of RF supertrees, especially when the number of input trees is large.
Devoted to the study of metrics that measure the similarity of phylogenies, "A Metric for Phylogenetic Trees Based on Matching" by Yu Lin, Vaibhav Rajan, and Bernard Moret introduces a new matching-based distance measure for two phylogenetic trees. The new measure is motivated by the observation that existing measures suffer from problems ranging from computational inefficiency to lack of robustness. The new measure can be viewed as a weighted extension of the widely used Robinson-Foulds distance. Theoretical analysis and statistical testing show that the new measure is a metric on the space of trees, is robust, and can be computed efficiently. Moreover, it is shown that the new measure does not exhibit unexpected behavior on the same inputs that cause problems for other measures. Applications of the measure to clustering trees are also described.
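For readers less familiar with the Robinson-Foulds distance mentioned in the two summaries above, the following minimal sketch illustrates the standard unweighted definition: the number of nontrivial splits (leaf bipartitions) present in exactly one of two trees on the same leaf set. It is not taken from either paper. The nested-tuple tree encoding and the function names `leaf_set`, `splits`, and `rf_distance` are assumptions made for illustration only.

```python
def leaf_set(tree):
    """Return the set of leaf labels of a tree given as nested tuples."""
    if not isinstance(tree, tuple):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Collect the nontrivial splits of a tree, viewed as unrooted.

    Each split is stored as the side that does NOT contain a fixed
    reference leaf, so that the same bipartition always gets the same key.
    """
    all_leaves = leaf_set(tree)
    reference = min(all_leaves)
    found = set()

    def visit(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = below if reference not in below else all_leaves - below
        # Keep only splits with at least two leaves on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return below

    visit(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Unweighted RF distance: size of the symmetric difference of splits."""
    return len(splits(tree_a) ^ splits(tree_b))


if __name__ == "__main__":
    t1 = (("A", "B"), ("C", "D"))
    t2 = (("A", "C"), ("B", "D"))
    print(rf_distance(t1, t1), rf_distance(t1, t2))
```

Both trees must be defined on the same leaf set for the comparison to be meaningful; the sketch does not check this.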
"The Kernel of Maximum Agreement Subtrees" by Krister M. Swenson, Eric Chen, Nicholas D. Pattengale, and David Sankoff introduces the Kernel Agreement SubTree (KAST), which summarizes the common substructure in all maximum agreement subtrees (MASTs), and shows that the KAST can be found in polynomial time for bounded-degree trees. It is also shown that the size of the KAST is correlated with how the input trees are related to each other. The main advantage of the KAST is that it is not as susceptible to rogue leaves as the very conservative strict consensus can be, and is not as misleading as a single MAST can be.
"Refining Regulatory Networks through Phylogenetic Transfer of Information" by Xiuwei Zhang and Bernard M.E. Moret describes ProPhyC, a probabilistic phylogenetic model designed to improve the inference of regulatory networks for a family of organisms by using the phylogenetic relationships among these organisms. The model can be adapted to different network evolutionary models and is shown to be robust against changes in those models. Extensive experimental results on simulated and biological data confirm that the model, together with its associated refinement algorithms, yields substantial improvement in the quality of inferred networks over existing methods.
"Algorithms to Detect Multiprotein Modularity Conserved during Evolution" by Luqman Hodgkinson and Richard Karp studies algorithms for detecting multiprotein modularity conserved during evolution. The authors introduce a definition of modularity for interactomes and develop a linear-time algorithm for detecting modular regions that change infrequently during evolution. The algorithm improves on the running time of previous algorithms for related problems and offers desirable theoretical guarantees. The authors also introduce a collection of biologically motivated evaluation measures that are sensitive to important issues not addressed by previous measures. The study of these evaluation measures leads to useful insights into the nature of interactomics data and the goals of various algorithms.
"Predicting Protein Function by Multi-label Correlated Semi-supervised Learning" by Qiang Jiang and Lisa McQuay proposes a new graph-based semi-supervised learning algorithm to associate proteins with multiple functions simultaneously. The proposed algorithm is shown to be less affected by the scarcity of labeled data than existing methods, since it takes into account the intrinsic correlation between different functional classes. During each iteration, each protein receives label information not only from neighbors that are annotated with the same class in the functional-linkage network, but also from partners labeled with other closely related classes. In 10-fold cross-validation on the yeast proteome compiled from the BioGRID database, the proposed algorithm always achieves superior performance when compared with five state-of-the-art approaches.
Observing that most protein centrality measures focus on the topologies of individual proteins and ignore the relevance between interactions and protein essentiality, "Identification of Essential Proteins Based on Edge Clustering Coefficient" by Jianxin Wang, Min Li, Huan Wang, and Yi Pan proposes a new centrality measure for identifying essential proteins, which is based on the edge clustering coefficient. The new measure considers the centrality of individual proteins as well as the interactions between proteins, and takes into account the modular nature of protein essentiality. Experimental results on protein-protein interaction networks show that the number of essential proteins discovered based on the new measure universally exceeds that discovered by other centrality measures. Moreover, the essential proteins discovered based on the new measure show a significant cluster effect.
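To make the idea of an edge-based centrality concrete, the sketch below computes one commonly used form of the edge clustering coefficient: the number of triangles that contain an edge, divided by the largest number of triangles that edge could belong to given the degrees of its endpoints. A node is then scored by summing the coefficients of its incident edges. This introduction does not state the paper's exact formula, so both the coefficient and the summation are assumptions here, and the names `build_graph`, `edge_clustering_coefficient`, and `node_score` are hypothetical.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def edge_clustering_coefficient(adj, u, v):
    """Triangles through edge (u, v) over the maximum possible number.

    Assumed definition: |N(u) & N(v)| / min(deg(u) - 1, deg(v) - 1),
    taken to be 0 when the denominator is 0.
    """
    max_triangles = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if max_triangles <= 0:
        return 0.0
    return len(adj[u] & adj[v]) / max_triangles


def node_score(adj, u):
    """Assumed node-level score: sum of coefficients over incident edges."""
    return sum(edge_clustering_coefficient(adj, u, v) for v in adj[u])


if __name__ == "__main__":
    # A triangle A-B-C with one pendant protein D attached to A.
    graph = build_graph([("A", "B"), ("B", "C"), ("A", "C"), ("A", "D")])
    for protein in sorted(graph):
        print(protein, node_score(graph, protein))
```

In this toy network the pendant protein D scores zero because its only interaction lies in no triangle, which illustrates how such a measure rewards proteins embedded in densely connected modules rather than those with high degree alone.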
We would like to thank the Program Committee members and external reviewers for volunteering their time to review the submissions to the symposium and the special section. We would also like to thank the Editor-in-Chief, Dr. Marie-France Sagot, for continuing to provide us with the opportunity to showcase some of the exciting research presented at ISBRA in the IEEE/ACM Transactions on Computational Biology and Bioinformatics. Last, but not least, we would like to thank all ISBRA authors—the symposium could not continue to thrive without their high-quality contributions.
• J. Chen is with the Department of Computer Science, Texas A&M University, College Station, TX 77843-3112. E-mail: firstname.lastname@example.org.
• A. Zelikovsky is with the Computer Science Department, Georgia State University, Atlanta, GA 30303-4110. E-mail: email@example.com.
J. Chen received the BS degree in computer science from Central South University, P. R. China, and the MS and PhD degrees in computer science from the Courant Institute of Mathematical Sciences, New York University. He then joined Columbia University, where he received MA, MPhil, and PhD degrees in mathematics. Currently, he is a professor of computer science and engineering at Texas A&M University. Dr. Chen has received many awards for his teaching and research in computer science and engineering, including the Janet Fabri Award for the Best PhD Dissertation in the Courant Institute, New York University, and the US NSF Research Initiation Award. He has received the TEES Select Young Faculty Award, the Amoco Faculty Award, the Eugene E. Webb '43 Faculty Fellow Award, the E.D. Brockett Professorship Award, the AFS Distinguished Faculty Achievement Award at the college level (twice), and the AFS Distinguished Faculty Achievement Award at the university level, all from Texas A&M University. He is currently serving on the editorial boards of the Journal of Computer and System Sciences, the IEEE Transactions on Computers, and Science in China: Information Sciences. He has also served as a program committee member for many international conferences. He is a steering committee member for the International Symposium on Parameterized and Exact Computation (IPEC). He was a program committee cochair for the Sixth Annual Conference on Theory and Applications of Models of Computation (TAMC '09), the Fourth International Workshop on Parameterized and Exact Computation (IWPEC '09), and the Seventh International Symposium on Bioinformatics Research and Applications (ISBRA '11). His research interests include algorithms and computational optimization, bioinformatics, computer graphics, and computer networks. He has published extensively in these areas.
A. Zelikovsky received the PhD degree in computer science from the Institute of Mathematics at the Belorussian Academy of Sciences in Minsk, Belarus, in 1989 and worked at the Institute of Mathematics in Kishinev, Moldova, from 1989 to 1995. Between 1992 and 1995, he visited Bonn University and the Institut für Informatik in Saarbrücken, Germany. Dr. Zelikovsky was a research scientist at the University of Virginia from 1995 to 1997 and a postdoctoral scholar at UCLA from 1997 to 1998. He is a professor in the Computer Science Department at Georgia State University, which he joined in 1999. His research interests include bioinformatics, discrete and approximation algorithms, combinatorial optimization, VLSI physical layout design, and ad hoc wireless networks. He is the author of more than 170 refereed publications and a coeditor of four books. Dr. Zelikovsky received the SIAM Outstanding Paper Prize and the best paper award at the joint Asia-South Pacific Design Automation/VLSI Design Conferences. He is the founding cochair of the Workshop on Computational Advances in Next Generation Sequencing, the Workshop on Computational Advances in Molecular Epidemiology, and the International Symposium on Bioinformatics Research and Applications (ISBRA). He has also served on the editorial boards of six journals and as a guest editor for numerous special issues, including five in IEEE Transactions.