Classification of CpG Islands in the Human Genome Based on the Interval Distance Distribution of Adjacent CG Sites
Computer Science and Information Engineering, World Congress on (2009)
Los Angeles, California USA
Mar. 31, 2009 to Apr. 2, 2009
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/CSIE.2009.822
Many studies have analyzed the relationship between CpG islands and gene function. Most results show that the promoters of many housekeeping genes contain CpG islands; however, the relationship between gene function and the positions of CG dinucleotides within CpG islands has received less attention. In this study, we classify CpG islands according to the interval distance distribution of adjacent CG sites and look for functional correlations. First, the human genome sequences were downloaded from the EMBL Nucleotide Sequence Database. Then a dataset was constructed in which each record is the interval distance distribution of adjacent CG sites in one CpG island. Finally, an algorithm was designed to calculate the approximately minimal difference between any two records. Using this algorithm with hierarchical clustering, we obtained many classes, each containing similar CpG islands, and studied some of their common features.
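To make the pipeline in the abstract concrete, the following Python sketch builds one record per sequence and clusters the records. It is not the authors' implementation. All function names are hypothetical, the L1 distance between normalized histograms is a simple stand-in for the paper's approximately-minimal-difference algorithm (which the abstract does not specify), and average linkage with a fixed threshold is an assumed choice of hierarchical clustering.

```python
from collections import Counter
from itertools import combinations


def cg_intervals(seq):
    """Distances between start positions of adjacent CG dinucleotides."""
    s = seq.upper()
    positions = [i for i in range(len(s) - 1) if s[i:i + 2] == "CG"]
    return [b - a for a, b in zip(positions, positions[1:])]


def interval_distribution(seq):
    """One record: normalized histogram {distance: fraction} of CG intervals."""
    gaps = cg_intervals(seq)
    if not gaps:
        return {}
    return {d: n / len(gaps) for d, n in Counter(gaps).items()}


def l1_difference(p, q):
    """Assumed stand-in measure: L1 distance between two distributions."""
    return sum(abs(p.get(d, 0.0) - q.get(d, 0.0)) for d in set(p) | set(q))


def average_linkage_clusters(records, threshold):
    """Naive agglomerative clustering over records.

    Repeatedly merges the two closest clusters (average pairwise
    difference) and stops once the closest pair exceeds `threshold`.
    Returns clusters as lists of record indices.
    """
    clusters = [[i] for i in range(len(records))]

    def linkage(a, b):
        total = sum(l1_difference(records[i], records[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda xy: linkage(clusters[xy[0]], clusters[xy[1]]),
        )
        if linkage(clusters[x], clusters[y]) > threshold:
            break
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return clusters
```

For example, calling `average_linkage_clusters([interval_distribution(s) for s in islands], 0.5)` on a hypothetical list `islands` of CpG island sequences groups together islands whose CG spacing histograms are similar. The threshold value is illustrative only, and the pairwise search is quadratic per merge, so this sketch suits small inputs rather than a genome-wide dataset.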
J. Du, X. Wu, L. Liu, B. Wang and C. Qi, "Classification of CpG Islands in the Human Genome Based on the Interval Distance Distribution of Adjacent CG Sites," 2009 WRI World Congress on Computer Science and Information Engineering (CSIE), Los Angeles, CA, 2009, pp. 246-249.