Issue No. 04 - July-Aug. (2013 vol. 10)

ISSN: 1545-5963

pp: 905-913

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.100

Nan Liu, Sch. of Comput. Sci. & Technol., Shandong Univ., Jinan, China

Haitao Jiang, Sch. of Comput. Sci. & Technol., Shandong Univ., Jinan, China

Daming Zhu, Sch. of Comput. Sci. & Technol., Shandong Univ., Jinan, China

Binhai Zhu, Dept. of Comput. Sci., Montana State Univ., Bozeman, MT, USA

ABSTRACT

Scaffold filling is a new combinatorial optimization problem in genome sequencing. The one-sided scaffold filling problem can be described as follows: given an incomplete genome I and a complete (reference) genome G, fill the missing genes into I such that the number of common (string) adjacencies between the resulting genome I' and G is maximized. This problem is NP-complete for genomes with duplicated genes, and the best known approximation factor, 1.33, is achieved by a greedy strategy. In this paper, we prove a better lower bound on the optimal solution and devise a new algorithm, based on the maximum matching method and a local improvement technique, which improves the approximation factor to 1.25. For genomes with gene repetitions, this is the only known NP-complete problem that admits an approximation with a small constant factor (less than 1.5).
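
To make the objective concrete, here is a minimal sketch of how the quantity being maximized could be computed. It assumes that an adjacency is an unordered pair of neighboring genes and that common adjacencies are counted as a multiset intersection; the paper's exact definition (for example, its treatment of genome ends) may differ. The function names and the toy genomes are hypothetical and serve only as illustration. The sketch scores a given filling and does not implement the 1.25-approximation algorithm.

```python
from collections import Counter

def adjacencies(genome):
    """Multiset of unordered pairs of neighboring genes (assumed definition)."""
    return Counter(tuple(sorted(pair)) for pair in zip(genome, genome[1:]))

def common_adjacencies(filled, reference):
    """Size of the multiset intersection of the two adjacency multisets."""
    return sum((adjacencies(filled) & adjacencies(reference)).values())

def missing_genes(incomplete, reference):
    """Genes (with multiplicity) that must be inserted into the incomplete genome."""
    return Counter(reference) - Counter(incomplete)

# Hypothetical toy instance: G is the reference, I the incomplete genome.
G = list("abcadc")
I = list("abdc")

to_insert = missing_genes(I, G)                          # one 'a' and one 'c'
score_good = common_adjacencies(list("abcadc"), G)       # one possible filling of I
score_poor = common_adjacencies(list("abdcac"), G)       # another filling, same genes
```

Both fillings insert the same two genes into I, but they share different numbers of adjacencies with G, which is why the choice of insertion positions is the optimization problem.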

INDEX TERMS

Bioinformatics, Genomics, Approximation methods, Approximation algorithms, Educational institutions, Algorithm design and analysis, Sequential analysis, algorithms, Comparative genomics, scaffold filling, breakpoints, adjacencies, NP-completeness

CITATION

Nan Liu, Haitao Jiang, Daming Zhu, Binhai Zhu, "An Improved Approximation Algorithm for Scaffold Filling to Maximize the Common Adjacencies",

*IEEE/ACM Transactions on Computational Biology and Bioinformatics*, vol. 10, no. 4, pp. 905-913, July-Aug. 2013, doi:10.1109/TCBB.2013.100