2009 ICSE Workshop on Search-Driven Development-Users, Infrastructure, Tools and Evaluation
Exploring Java software vocabulary: A search and mining perspective
Vancouver, BC, Canada
May 16, 2009
ISBN: 978-1-4244-3740-5
Erik Linstead, School of Information and Computer Sciences. University of California, Irvine, USA
Lindsey Hughes, Department of Math and Computer Science. Chapman University, Orange, CA, USA
Cristina Lopes, School of Information and Computer Sciences. University of California, Irvine, USA
Pierre Baldi, School of Information and Computer Sciences. University of California, Irvine, USA
We conduct a large-scale analysis of Java source code vocabulary for 12,151 open source projects from SourceForge and Apache, a corpus substantially larger than considered previously. Simple statistical analysis demonstrates robust power-law behavior for word count distributions across multiple program entities. We then identify salient vocabulary trends for classes, interfaces, methods, and fields. Our results provide low-level insight into the vocabulary space governing Java software development, with direct application to program comprehension and software search. Supplementary material may be found at: http://sourcerer.ics.uci.edu/suite2009/suite.html.
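As a rough illustration of the kind of analysis the abstract describes, the sketch below splits Java identifiers into words, counts them, and estimates a power-law exponent from the slope of a log-log rank-frequency fit. It is not the authors' method: the helper names (`split_identifier`, `word_counts`, `loglog_slope`), the camelCase splitting rule, and the use of a least-squares fit on rank versus frequency are all assumptions made for this example.

```python
import math
import re
from collections import Counter

# Assumed tokenization rule (not from the paper): an acronym run such as
# "XML", or an optionally capitalized lowercase run such as "Parser".
# Underscores and digits act as separators and are dropped.
_WORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+")


def split_identifier(name):
    """Split a Java identifier into lowercase words."""
    return [w.lower() for w in _WORD.findall(name)]


def word_counts(identifiers):
    """Count word occurrences across a collection of identifiers."""
    counts = Counter()
    for name in identifiers:
        counts.update(split_identifier(name))
    return counts


def loglog_slope(counts):
    """Least-squares slope of log(frequency) against log(rank).

    For frequency ~ C * rank**(-s), the slope is approximately -s.
    Requires at least two distinct words.
    """
    freqs = sorted(counts.values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical method names, for demonstration only.
    names = ["getUserName", "setUserName", "parseXMLDocument", "MAX_VALUE"]
    counts = word_counts(names)
    print(counts.most_common(3))
    print(loglog_slope(counts))
```

A straight-line fit on a log-log plot is only a quick diagnostic. On a real corpus, separate counts per entity type (classes, interfaces, methods, fields) would be needed to mirror the breakdown mentioned above.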
Citation:
Erik Linstead, Lindsey Hughes, Cristina Lopes, Pierre Baldi, "Exploring Java software vocabulary: A search and mining perspective," suite, pp.29-32, 2009 ICSE Workshop on Search-Driven Development-Users, Infrastructure, Tools and Evaluation, 2009