2009 ICSE Workshop on Search-Driven Development-Users, Infrastructure, Tools and Evaluation
Exploring Java software vocabulary: A search and mining perspective
Vancouver, BC, Canada, May 16, 2009
ISBN: 978-1-4244-3740-5
We conduct a large-scale analysis of Java source code vocabulary for 12,151 open source projects from SourceForge and Apache, a corpus substantially larger than those considered previously. Simple statistical analysis demonstrates robust power-law behavior for word count distributions across multiple program entities. We then identify salient vocabulary trends for classes, interfaces, methods, and fields. Our results provide low-level insight into the vocabulary space governing Java software development, with direct application to program comprehension and software search. Supplementary material may be found at: http://sourcerer.ics.uci.edu/suite2009/suite.html.
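The abstract does not say how identifiers were split into words or how the power-law behavior was assessed, so the following is only a minimal sketch of one common approach, not the authors' method. It assumes identifiers are split on camelCase boundaries, digits, and underscores, and it uses a least-squares fit on the log-log rank-frequency curve as a rough diagnostic. The names `split_identifier` and `rank_frequency_slope` are hypothetical.

```python
import math
import re
from collections import Counter

# Assumed tokenization rule: acronym runs, capitalized or lowercase words, digit runs.
_WORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def split_identifier(name):
    """Split a Java identifier such as 'getFileName' into lowercase words."""
    return [w.lower() for w in _WORD.findall(name)]


def count_words(identifiers):
    """Count word occurrences across a collection of identifiers."""
    counts = Counter()
    for name in identifiers:
        counts.update(split_identifier(name))
    return counts


def rank_frequency_slope(counts):
    """Least-squares slope of log(frequency) against log(rank).

    A roughly straight log-log curve is consistent with a power law;
    this is a quick diagnostic, not a rigorous statistical test.
    """
    freqs = sorted(counts.values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct words")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

In use, one would pass the class, interface, method, or field names extracted from a corpus to `count_words` and inspect the slope for each entity type separately. A slope near a constant negative value over a wide range of ranks would be in line with the kind of behavior the abstract describes.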
Citation:
Erik Linstead, Lindsey Hughes, Cristina Lopes, Pierre Baldi, "Exploring Java software vocabulary: A search and mining perspective," suite, pp. 29-32, 2009 ICSE Workshop on Search-Driven Development-Users, Infrastructure, Tools and Evaluation, 2009