Fourth IEEE Symposium on Bioinformatics and Bioengineering (BIBE'04)
A Method for Evaluating the Quality of String Dissimilarity Measures and Clustering Algorithms for EST Clustering
Taichung, Taiwan, ROC
May 19-May 21
ISBN: 0-7695-2173-8
Judith Zimmermann, ETH Zurich
Zsuzsanna Lipták, Universität Bielefeld, Germany
Scott Hazelhurst, University of the Witwatersrand, South Africa
We present a method for evaluating the suitability of different string dissimilarity measures and clustering algorithms for EST clustering, one of the main techniques used in transcriptome projects. The method comprises generating simulated ESTs with user-specified parameters, and then evaluating the quality of clusterings produced when different dissimilarity measures and different clustering algorithms are used. We implemented two tools to do this: ESTSim (EST Simulator), which generates simulated EST sequences from mRNAs/cDNAs using user-specified parameters, and ECLEST (Evaluator for CLusterings of ESTs), which computes and evaluates a clustering of a set of input ESTs, where the dissimilarity measure, the clustering algorithm, and the clustering validity index can be specified independently. We demonstrate the method on a sample of 699 cDNAs, generating approximately 16,000 simulated ESTs. We conducted two experiments comparing subword-based dissimilarity measures with alignment-based ones and obtained statistically significant results.
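To make the idea of a subword-based dissimilarity measure concrete, the following is a minimal sketch, not the authors' ESTSim or ECLEST code. It assumes one common subword measure (squared Euclidean distance between k-mer count vectors) and a simple threshold-based single-linkage clustering; the function names, the word length k = 3, and the threshold are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations


def kmer_counts(seq, k):
    """Count every overlapping length-k subword of seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def subword_dissimilarity(a, b, k=3):
    """Squared Euclidean distance between k-mer count vectors.

    Illustrative assumption: the paper's own measures may differ.
    """
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    return sum((ca[w] - cb[w]) ** 2 for w in set(ca) | set(cb))


def single_linkage(seqs, threshold, k=3):
    """Group sequences linked by a pair with dissimilarity <= threshold."""
    parent = list(range(len(seqs)))  # union-find forest over sequence indices

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if subword_dissimilarity(seqs[i], seqs[j], k) <= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(seqs)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy sequences standing in for simulated ESTs (made up for this example).
toy_ests = ["ACGTACGTAC", "CGTACGTACG", "TTTTGGGGCC", "TTTGGGGCCC"]
toy_clusters = single_linkage(toy_ests, threshold=5)
```

On simulated data the true cluster of each EST is known, because each EST is generated from a known source sequence, so a clustering such as `toy_clusters` can be scored against that ground truth with a validity index of the user's choice.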
Index Terms:
string similarity and dissimilarity measures, EST clustering, transcriptome, simulated data, benchmarks
Citation:
Judith Zimmermann, Zsuzsanna Lipták, Scott Hazelhurst, "A Method for Evaluating the Quality of String Dissimilarity Measures and Clustering Algorithms for EST Clustering," Fourth IEEE Symposium on Bioinformatics and Bioengineering (BIBE'04), p. 301, 2004