Guest Editors' Introduction to the Special Section on Bioinformatics Research and Applications

Ion Mandoiu
Yi Pan
Raj Sunderraman
Alexander Zelikovsky

pp. 178-179

This special section includes a selection of papers presented at the Fourth International Symposium on Bioinformatics Research and Applications (ISBRA), which was held at Georgia State University in Atlanta, Georgia, on 6-9 May 2008. The ISBRA symposium provides a forum for the exchange of ideas and results among researchers, developers, and practitioners working on all aspects of bioinformatics and computational biology and their applications. In 2008, 94 papers were submitted in response to the call for papers, of which 35 appeared in the ISBRA proceedings, published as volume 4983 of Springer-Verlag's Lecture Notes in Bioinformatics series.

A small number of authors were invited to submit extended versions of their symposium papers to this special section. Following a rigorous review process, five papers were selected for publication. The selected papers cover a broad range of bioinformatics topics, including multiple local sequence alignment methods, computational prediction of siRNA silencing efficacy, gene network models, microarray data analysis and inference, and reconstruction and analysis of phylogenetic trees.

The first paper by Treangen et al. presents a novel approach to identify interspersed repeats in genome sequences. Existing methods perform pairwise local sequence alignments to identify homologues, but these methods are not scalable and have limited accuracy. The method proposed in the paper uses a clever combination of a gapped extension heuristic and an efficient filtration technique to achieve greater accuracy in the identification of interspersed repeats. The proposed method is implemented and made available for download.

In the second paper, Qiu and Lane adapt the Support Vector Regression approach by considering multiple kernel functions to effectively predict siRNA silencing efficacy. Computational prediction of the initiator siRNA molecules can be of tremendous assistance to scientists in screening these molecules before they are used in biological experiments. The authors formulate multiple kernel learning as a quadratically constrained quadratic programming problem, provide several heuristics, and empirically establish the superiority of their approach over current methods in accuracy, model complexity, and computational speed.

In the third paper, Park et al. employ gene network models in a novel manner to analyze microarray data and infer cancer progression. This approach considerably improves the estimates of evolutionary distance between tumors over methods that employ only gene expression profiles. They present three variants of the gene network model approach: one that uses optimized best-fit networks, a second that uses sampling to infer high-confidence subnetworks, and a third that uses modular networks inferred from clusters of similarly expressed genes. The three variants show excellent results on lung cancer and breast cancer microarray data.

The last two papers are devoted to advanced methods for the reconstruction and analysis of phylogenetic trees. The paper by Zhu et al. proposes a new way to define and analyze gene clusters and gene order. They show that the bandwidth parameter of a graph is tightly connected with the proposed parameterized definition of gene clusters and affects the number, size, and extent of preservation of identified clusters along phylogenetic trees. The latter property is computed using a new dynamic programming algorithm. The advantages of the proposed analysis methods are illustrated by application to a set of genomes drawn from the Yeast Gene Order Browser.
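For readers unfamiliar with the term, the following is a minimal sketch of the standard graph-theoretic notion of bandwidth: the smallest achievable value, over all linear orderings of the vertices, of the largest distance between the endpoints of any edge. It is not taken from the paper by Zhu et al., whose parameterized cluster definition may differ in detail. The function names and the toy path graph are hypothetical, and the brute-force search is only practical for a handful of vertices.

```python
from itertools import permutations


def ordering_bandwidth(edges, order):
    """Largest distance between edge endpoints under one fixed linear ordering."""
    pos = {v: i for i, v in enumerate(order)}
    return max((abs(pos[u] - pos[v]) for u, v in edges), default=0)


def graph_bandwidth(vertices, edges):
    """Bandwidth of the graph: minimum of ordering_bandwidth over all orderings.

    Brute force over every permutation, so illustrative only (factorial time).
    """
    return min(ordering_bandwidth(edges, p) for p in permutations(vertices))


# Hypothetical toy example: a path a - b - c - d.
path_edges = [("a", "b"), ("b", "c"), ("c", "d")]
print(ordering_bandwidth(path_edges, "acbd"))  # a shuffled ordering stretches edges
print(graph_bandwidth("abcd", path_edges))     # the natural ordering is optimal
```

A small bandwidth means the vertices can be laid out on a line so that every edge joins nearby positions, which is the intuition behind relating it to genes that stay close together along a chromosome.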

The paper by Bansal et al. is devoted to the problem of inferring a species supertree by reconciling gene trees, including those constructed for large families of duplicated genes, based on the duplication optimality criterion. The resulting optimization problem (commonly referred to as the gene-duplication problem) is NP-hard and practical solutions are frequently based on local search heuristics. In each step, these heuristics must find a phylogenetic tree that is optimal under the duplication optimality criterion in the neighborhood of the current tree, i.e., the set of trees that can be obtained from it by applying a variety of tree edit operations. The authors propose near-linear time algorithms for searching optimal trees within neighborhoods defined by the $k$-NNI (Nearest Neighbor Interchange) tree edit operation for $k\in\{1,2,3\}$. They validate their algorithms using sets of large randomly generated gene trees.
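To make the notion of an NNI neighborhood concrete, the sketch below naively enumerates the trees reachable from a rooted binary tree by a single NNI move, taken here to mean swapping a subtree with the sibling of its parent. This is an illustration under stated assumptions, not the near-linear time algorithm of Bansal et al.: the nested-tuple tree representation and the function names are hypothetical, and the search does not evaluate the duplication cost at all.

```python
def leaves(tree):
    """Return the leaf labels of a tree given as nested 2-tuples of strings."""
    if isinstance(tree, str):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


def nni_neighbors(tree):
    """Yield every tree obtained from `tree` by one rooted NNI move.

    A leaf is a string; an internal node is a pair (left, right).
    At a node ((a, b), s), the subtree s is exchanged with a or with b,
    and symmetrically when the right child is the internal one.
    """
    if isinstance(tree, str):
        return
    left, right = tree
    if not isinstance(left, str):
        a, b = left
        yield ((right, b), a)   # swap a with its parent's sibling
        yield ((a, right), b)   # swap b with its parent's sibling
    if not isinstance(right, str):
        a, b = right
        yield (a, (left, b))
        yield (b, (a, left))
    # Moves located deeper in the tree leave the other subtree untouched.
    for new_left in nni_neighbors(left):
        yield (new_left, right)
    for new_right in nni_neighbors(right):
        yield (left, new_right)


# Hypothetical four-species example.
tree = (("A", "B"), ("C", "D"))
for neighbor in nni_neighbors(tree):
    print(neighbor)
```

Each internal node other than the root contributes two moves, so the 1-NNI neighborhood grows linearly with the number of leaves. Larger neighborhoods could be obtained by applying the generator repeatedly, assuming $k$-NNI denotes up to $k$ successive moves.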

We would like to thank the Program Committee members and external reviewers for volunteering their time to review the submissions to the symposium and the special section. We would also like to thank former Editor-in-Chief, Professor Dan Gusfield, as well as the current Editor-in-Chief, Dr. Marie-France Sagot, for continuing to provide us with the opportunity to showcase some of the exciting research presented at ISBRA in the IEEE/ACM Transactions on Computational Biology and Bioinformatics. Last but not least, we would like to thank all ISBRA authors—the symposium could not continue to thrive without their high-quality contributions.

Ion Mandoiu
Yi Pan
Raj Sunderraman
Alexander Zelikovsky
Guest Editors