2013 10th Working Conference on Mining Software Repositories (MSR) (2013)
San Francisco, CA, USA
May 18, 2013 to May 19, 2013
Xin Xia, College of Computer Science and Technology, Zhejiang University
David Lo, School of Information Systems, Singapore Management University
Xinyu Wang, College of Computer Science and Technology, Zhejiang University
Bo Zhou, College of Computer Science and Technology, Zhejiang University
Software engineers use a variety of online media to discover new and interesting technologies, and to learn from and help one another. We refer to online media that help software engineers improve their performance in software development, maintenance, and testing as software information sites. Tags are common in software information sites, and many sites allow users to tag objects with their own words. Users increasingly use tags to describe the most important features of their posted content or projects. In this paper, we propose TagCombine, an automatic tag recommendation method that analyzes objects in software information sites. TagCombine has three components: (1) a multi-label ranking component, which treats tag recommendation as a multi-label learning problem; (2) a similarity-based ranking component, which recommends tags from similar objects; and (3) a tag-term-based ranking component, which considers the relationship between terms and tags and recommends tags after analyzing the terms in the objects. We evaluate TagCombine on two software information sites, StackOverflow and Freecode, which contain 47,668 and 39,231 text documents and 437 and 243 tags, respectively. Experimental results show that on StackOverflow, TagCombine achieves recall@5 and recall@10 scores of 0.5964 and 0.7239, respectively; on Freecode, it achieves recall@5 and recall@10 scores of 0.6391 and 0.7773, respectively. Moreover, averaged over the StackOverflow and Freecode results, TagCombine improves on TagRec, proposed by Al-Kofahi et al., by 22.65% and 14.95%, and on the tag recommendation method proposed by Zangerle et al. by 18.5% and 7.35%, for recall@5 and recall@10, respectively.
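The recall@k figures quoted in the abstract can be made concrete with a small sketch. It assumes a common definition of recall@k: for each object, the fraction of its true tags that appear among the top k recommendations, averaged over all objects. The abstract does not say how the three ranking components are merged, so the weighted sum in `combine_scores` is purely an illustrative assumption; the function names, weights, and sample tags are all hypothetical.

```python
from typing import Dict, List, Sequence, Set


def combine_scores(component_scores: Sequence[Dict[str, float]],
                   weights: Sequence[float]) -> Dict[str, float]:
    """Merge per-tag scores from several ranking components.

    ASSUMPTION: a weighted sum is used here for illustration only;
    the abstract does not specify the combination scheme.
    """
    combined: Dict[str, float] = {}
    for scores, weight in zip(component_scores, weights):
        for tag, score in scores.items():
            combined[tag] = combined.get(tag, 0.0) + weight * score
    return combined


def top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k highest-scoring tags (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:k]]


def recall_at_k(recommended: Sequence[Sequence[str]],
                actual: Sequence[Set[str]],
                k: int) -> float:
    """Average, over objects, of |top-k recommended ∩ true tags| / |true tags|.

    Objects with no true tags are skipped to avoid division by zero.
    """
    per_object = [
        len(set(rec[:k]) & true) / len(true)
        for rec, true in zip(recommended, actual)
        if true
    ]
    return sum(per_object) / len(per_object) if per_object else 0.0


if __name__ == "__main__":
    # Hypothetical scores from three components for a single object.
    scores = combine_scores(
        [{"java": 0.9, "python": 0.2}, {"java": 0.1, "c": 0.8}, {"python": 0.5}],
        weights=[0.5, 0.3, 0.2],
    )
    print(top_k(scores, 2))

    # Hypothetical ranked recommendations and true tags for two objects.
    recommended = [["java", "python", "c"], ["linux", "web"]]
    actual = [{"java", "c"}, {"web", "database"}]
    print(recall_at_k(recommended, actual, 2))
```

Under this definition, a recall@5 of 0.5964 would mean that, on average, about 60% of an object's true tags appear among its top five recommended tags.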
Software, Media, Software algorithms, Vectors, Prediction algorithms, Search problems, Educational institutions
X. Xia, D. Lo, X. Wang and B. Zhou, "Tag recommendation in software information sites," 2013 10th IEEE Working Conference on Mining Software Repositories (MSR 2013), San Francisco, CA, USA, 2013, pp. 287-296.