18th International Conference on Pattern Recognition (ICPR'06) Volume 3
Scalable Representative Instance Selection and Ranking
Hong Kong
August 20-August 24
ISBN: 0-7695-2521-0
Xingquan Zhu, University of Vermont, VT
Xindong Wu, University of Vermont, VT
Finding a small set of representative instances for a large dataset benefits data mining practitioners in two ways: (1) a learner built from the selected instances can be superior to one constructed from the whole massive dataset; and (2) practitioners no longer need to work on the whole original dataset every time. In this paper we propose a Scalable Representative Instance Selection and Ranking mechanism (SRISTAR, pronounced 3STAR), which has two distinguishing features: (1) it provides a ranked list of representative instances, so that users can select instances from the top of the list downward until they have the number of examples they want; and (2) it bases instance selection on the behaviors of the underlying examples, and the selection procedure tries to optimize the expected future error. Given a dataset, we first cluster instances into small data cells, each consisting of instances with similar behaviors. We then progressively evaluate data cells and their combinations, and order them into a list such that learners built from the top cells are more accurate.
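The two-stage idea described above (group instances into cells, then order the cells so that learners built from the top of the list are accurate) can be illustrated with a minimal sketch. This is not the SRISTAR algorithm itself: the abstract does not specify how "behavior" is measured or how cells are evaluated, so everything here is an assumption made for illustration. The sketch groups instances by class label and a coarse grid bucket (`make_cells`), estimates error with a 1-nearest-neighbor learner on a held-out validation set (`error`), and ranks cells greedily by how much each one lowers that error (`rank_cells`). All function names and the `width` parameter are hypothetical.

```python
from collections import defaultdict


def make_cells(X, y, width=1.0):
    """Group instance indices into cells (assumption: same label + same grid bucket)."""
    cells = defaultdict(list)
    for i, (x, label) in enumerate(zip(X, y)):
        key = (label,) + tuple(int(v // width) for v in x)
        cells[key].append(i)
    return list(cells.values())


def predict_1nn(train_idx, X, y, query):
    """Label of the training instance closest to `query` (squared Euclidean distance)."""
    nearest = min(
        train_idx,
        key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], query)),
    )
    return y[nearest]


def error(train_idx, X, y, X_val, y_val):
    """Validation error of a 1-NN learner built only from `train_idx`."""
    if not train_idx:
        return 1.0
    wrong = sum(
        predict_1nn(train_idx, X, y, q) != t for q, t in zip(X_val, y_val)
    )
    return wrong / len(X_val)


def rank_cells(cells, X, y, X_val, y_val):
    """Greedy ranking: repeatedly append the cell that gives the lowest
    validation error when combined with the cells already chosen."""
    ranked, chosen, remaining = [], [], list(cells)
    while remaining:
        best = min(remaining, key=lambda c: error(chosen + c, X, y, X_val, y_val))
        remaining.remove(best)
        chosen = chosen + best
        ranked.append(best)
    return ranked


if __name__ == "__main__":
    # Tiny two-class toy dataset (made up for the example).
    X = [(0.2, 0.2), (0.4, 0.3), (1.2, 0.2), (10.2, 10.2), (10.5, 10.4), (11.2, 11.3)]
    y = [0, 0, 0, 1, 1, 1]
    X_val, y_val = [(0.5, 0.5), (10.0, 10.0)], [0, 1]
    ranking = rank_cells(make_cells(X, y), X, y, X_val, y_val)
    for rank, cell in enumerate(ranking, start=1):
        print(rank, cell)
```

A user who wants roughly n instances would take cells from the front of the returned list until n instances are collected. The greedy loop re-evaluates every remaining cell at each step, so it is quadratic in the number of cells; a scalable method would need a cheaper evaluation than this sketch uses.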
Citation:
Xingquan Zhu, Xindong Wu, "Scalable Representative Instance Selection and Ranking," 18th International Conference on Pattern Recognition (ICPR'06), vol. 3, pp. 352-355, 2006