Sixth IEEE International Conference on Data Mining Workshops (2006)
Hong Kong, China
Dec. 18, 2006 to Dec. 22, 2006
ISBN: 0-7695-2702-7
pp: 142-146
Amit Saple, Indiana University
Kwangmin Choi, Indiana University
Sun Kim, Indiana University
As more genomes become available, comparing multiple genomes has become an important research method in molecular biology. It is useful not only for finding features common to different genomes but also for understanding the evolutionary processes and mechanisms that relate them. However, developing a system for genome comparison is challenging because the many computational tools and databases involved can be combined in numerous ways. In this paper, we discuss a design principle for bioinformatics systems: defining a genome data type and developing novel data mining algorithms based on it. Although many systems have explored novel approaches to managing workflows among tools and databases flexibly, our use of the genome data type distinguishes our approach from existing ones. In particular, our approach is user-centric because the domains and co-domains of the genome data type are what users are looking for. In addition, incorporating novel data mining algorithms defined on the genome data type makes genome comparison tasks simple and manageable, even on the web.
