Particle Mixed Membership Stochastic Block Model
Found in: 2012 Eighth International Conference on Semantics, Knowledge and Grids (SKG)
By Yan Zhang, Qixia Jiang, Maosong Sun
Issue Date:October 2012
pp. 88-95
Massive real-world data are network-structured, such as semantic web, social network, relationship between proteins, etc. Modeling a network is an effective way for better understanding the properties of a network, while avoiding the complexity of the full...
NExT: NUS-Tsinghua Center for Extreme Search of User-Generated Content
Found in: IEEE Multimedia
By Tat-Seng Chua, Huanbo Luan, Maosong Sun, Shiqiang Yang
Issue Date:July 2012
pp. 81-87
The Web has revolutionized the way we create, disseminate, and consume information. Users have changed from passive recipients of information to active content consumers and creators, and the nature of information has also changed from static text to dynam...
Modeling Social Annotations via Latent Reason Identification
Found in: IEEE Intelligent Systems
By Xiance Si, Zhiyuan Liu, Maosong Sun
Issue Date:November 2010
pp. 42-49
The probabilistic Tag Allocation Model (TAM) explains social tags by modeling the latent reasoning behind each tag in order to disambiguate them and identify noise.
Tag Allocation Model: Model Noisy Social Annotations by Reason Finding
Found in: Web Intelligence and Intelligent Agent Technology, IEEE/WIC/ACM International Conference on
By Xiance Si, Maosong Sun
Issue Date:September 2010
pp. 413-416
We propose the Tag Allocation Model (TAM) to model social annotation data. TAM is a probabilistic generative model, its key feature is finding the latent reason for each tag. A latent reason can be any discrete features of the document (such as words) or a...
Text Classification Based on Transfer Learning and Self-Training
Found in: International Conference on Natural Computation
By Yabin Zheng, Shaohua Teng, Zhiyuan Liu, Maosong Sun
Issue Date:October 2008
pp. 363-367
Traditional text classification methods make a basic assumption: the training and test set are homologous, while this naïve assumption may not hold in the real world, especially in the web environment. Documents on the web change from time to time, pre-tra...
Multi-modal Multi-label Semantic Indexing of Images Using Unlabeled Data
Found in: Advanced Language Processing and Web Information Technology, International Conference on
By Wei Li, Maosong Sun
Issue Date:July 2008
pp. 204-209
Automatic image annotation (AIA) refers to the association of words to whole images which is considered as a promising and effective approach to bridge the semantic gap between low-level visual features and high-level semantic concepts. In this paper, we f...
Leveraging World Knowledge in Chinese Text Classification
Found in: Advanced Language Processing and Web Information Technology, International Conference on
By Shu Xu, Maosong Sun
Issue Date:August 2007
pp. 33-38
In state-of-the-art Text Classification (TC) approaches, only features explicitly mentioned in training set are taken into consideration, but after several decades' endeavor, it seems that these approaches have all reached a plateau. In this paper, we prop...
User Behaviors in Related Word Retrieval and New Word Detection: A Collaborative Perspective
Found in: ACM Transactions on Asian Language Information Processing (TALIP)
By Lixing Xie, Liyun Ru, Maosong Sun, Yabin Zheng, Yang Zhang, Zhiyuan Liu
Issue Date:December 2011
pp. 1-26
Nowadays, user behavior analysis and collaborative filtering have drawn a large body of research in the machine learning community. The goal is either to enhance the user experience or discover useful information hidden in the data. In this article, we con...
PLDA+: Parallel latent dirichlet allocation with data placement and pipeline processing
Found in: ACM Transactions on Intelligent Systems and Technology (TIST)
By Edward Y. Chang, Maosong Sun, Yuzhou Zhang, Zhiyuan Liu
Issue Date:April 2011
pp. 1-18
Previous methods of distributed Gibbs sampling for LDA run into either memory or communication bottlenecks. To improve scalability, we propose four strategies: data placement, pipeline processing, word bundling, and priority-based scheduling. Experiments s...
Asymmetrical query recommendation method based on bipartite network resource allocation
Found in: Proceeding of the 17th international conference on World Wide Web (WWW '08)
By Maosong Sun, Zhiyuan Liu
Issue Date:April 2008
pp. 1-7
This paper presents a new query recommendation method that generates recommended query list by mining large-scale user logs. Starting from the user logs of click-through data, we construct a bipartite network where the nodes on one side correspond to uniqu...