IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 10, no. 4, July-Aug. 2013

pp: 905-913

Nan Liu, Sch. of Comput. Sci. & Technol., Shandong Univ., Jinan, China

Haitao Jiang, Sch. of Comput. Sci. & Technol., Shandong Univ., Jinan, China

Daming Zhu, Sch. of Comput. Sci. & Technol., Shandong Univ., Jinan, China

Binhai Zhu, Dept. of Comput. Sci., Montana State Univ., Bozeman, MT, USA

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.100

ABSTRACT

Scaffold filling is a new combinatorial optimization problem in genome sequencing. The one-sided scaffold filling problem can be described as follows: given an incomplete genome I and a complete (reference) genome G, fill the missing genes into I such that the number of common (string) adjacencies between the resulting genome I' and G is maximized. This problem is NP-complete for genomes with duplicated genes, and the best known approximation factor is 1.33, achieved by a greedy strategy. In this paper, we prove a better lower bound on the optimal solution and devise a new algorithm, based on the maximum matching method and a local improvement technique, that improves the approximation factor to 1.25. For genomes with gene repetitions, this is the only known NP-complete problem that admits an approximation with a small constant factor (less than 1.5).
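
To make the objective concrete, the following Python sketch counts common adjacencies and solves tiny instances by exhaustive search. It is not the paper's 1.25-approximation algorithm. It rests on assumptions that the abstract does not spell out: an adjacency is taken to be an unordered pair of neighbouring genes, common adjacencies are counted as a multiset intersection, no end markers are added, and the missing genes are the multiset difference between G and I. The names `adjacencies`, `common_adjacencies`, `fillings`, and `best_filling` are illustrative only.

```python
from collections import Counter


def adjacencies(genome):
    """Multiset of unordered pairs of neighbouring genes (assumed definition)."""
    return Counter(tuple(sorted(pair)) for pair in zip(genome, genome[1:]))


def common_adjacencies(a, b):
    """Size of the multiset intersection of the two adjacency multisets."""
    return sum((adjacencies(a) & adjacencies(b)).values())


def fillings(scaffold, missing):
    """Yield every sequence that keeps `scaffold` in order and inserts all
    genes in the Counter `missing`. Exponential; for tiny inputs only."""
    if not scaffold and sum(missing.values()) == 0:
        yield ()
        return
    if scaffold:
        # Next position holds the next gene of the incomplete genome.
        for rest in fillings(scaffold[1:], missing):
            yield (scaffold[0],) + rest
    for gene in list(missing):
        if missing[gene] > 0:
            # Next position holds one of the missing genes.
            missing[gene] -= 1
            for rest in fillings(scaffold, missing):
                yield (gene,) + rest
            missing[gene] += 1


def best_filling(incomplete, reference):
    """Brute-force optimum: (number of common adjacencies, filled genome)."""
    missing = Counter(reference) - Counter(incomplete)
    candidates = fillings(tuple(incomplete), missing)
    return max(
        ((common_adjacencies(f, reference), f) for f in candidates),
        key=lambda item: item[0],
    )


score, filled = best_filling(list("ad"), list("abcd"))
print(score, "".join(filled))  # 3 abcd
```

With I = `ad` and G = `abcd`, the missing genes are `b` and `c`, and inserting them between `a` and `d` recovers all three adjacencies of G. The exhaustive search is only practical for a handful of genes, which is consistent with the problem being NP-complete when genes are duplicated.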

INDEX TERMS

Bioinformatics, Genomics, Approximation methods, Approximation algorithms, Educational institutions, Algorithm design and analysis, Sequential analysis, algorithms, Comparative genomics, scaffold filling, breakpoints, adjacencies, NP-completeness

CITATION

Nan Liu, Haitao Jiang, Daming Zhu, Binhai Zhu, "An Improved Approximation Algorithm for Scaffold Filling to Maximize the Common Adjacencies", *IEEE/ACM Transactions on Computational Biology and Bioinformatics*, vol. 10, no. 4, pp. 905-913, July-Aug. 2013, doi:10.1109/TCBB.2013.100