On the Deep Order-Preserving Submatrix Problem: A Best Effort Approach
Found in: IEEE Transactions on Knowledge and Data Engineering
By Byron J. Gao, Obi L. Griffith, Martin Ester, Hui Xiong, Qiang Zhao, Steven J.M. Jones
Issue Date:February 2012
pp. 309-325
 
The Minimum Consistent Subset Cover Problem: A Minimization View of Data Mining
Found in: IEEE Transactions on Knowledge and Data Engineering
By Byron J. Gao, Martin Ester, Hui Xiong, Jin-Yi Cai, Oliver Schulte
Issue Date:March 2013
pp. 690-703
In this paper, we introduce and study the minimum consistent subset cover (MCSC) problem. Given a finite ground set X and a constraint t, find the minimum number of consistent subsets that cover X, where a subset of X is consistent if it satisfies t. The M...
 
Turning Clusters into Patterns: Rectangle-Based Discriminative Data Description
Found in: Data Mining, IEEE International Conference on
By Byron J. Gao, Martin Ester
Issue Date:December 2006
pp. 200-211
The ultimate goal of data mining is to extract knowledge from massive data. Knowledge is ideally represented as human-comprehensible patterns from which end-users can gain intuitions and insights. Yet not all data mining methods produce such readily unders...
 
User-Centric Organization of Search Results
Found in: IEEE Internet Computing
By Byron J. Gao, David Buttler, David C. Anastasiu, Shuaiqiang Wang, Peng Zhang, Joey Jan
Issue Date:May 2013
pp. 52-59
 
Building Community Wikipedias: A Machine-Human Partnership Approach
Found in: Data Engineering, International Conference on
By Pedro DeRose, Xiaoyong Chai, Byron J. Gao, Warren Shen, AnHai Doan, Philip Bohannon, Xiaojin Zhu
Issue Date:April 2008
pp. 646-655
The rapid growth of Web communities has motivated many solutions for building community data portals. These solutions follow roughly two approaches. The first approach (e.g., Libra, Citeseer, Cimple) employs semi-automatic methods to extract and integrate ...
 
A Model for Discovering Correlations of Ubiquitous Things
Found in: 2013 IEEE International Conference on Data Mining (ICDM)
By Lina Yao, Quan Z. Sheng, Byron J. Gao, Anne H.H. Ngu, Xue Li
Issue Date:December 2013
pp. 1253-1258
With recent advances in radio-frequency identification (RFID), wireless sensor networks, and Web services, physical things are becoming an integral part of the emerging ubiquitous Web. Correlation discovery for ubiquitous things is critical for many import...
 
Enabling Fast Lazy Learning for Data Streams
Found in: Data Mining, IEEE International Conference on
By Peng Zhang, Byron J. Gao, Xingquan Zhu, Li Guo
Issue Date:December 2011
pp. 932-941
Lazy learning, such as k-nearest neighbor learning, has been widely applied to many applications. Known for well capturing data locality, lazy learning can be advantageous for highly dynamic and complex learning environments such as data streams. Yet its h...
 
E-Tree: An Efficient Indexing Structure for Ensemble Models on Data Streams
Found in: IEEE Transactions on Knowledge and Data Engineering
By Peng Zhang, Chuan Zhou, Peng Wang, Byron J. Gao, Xingquan Zhu, Li Guo
Issue Date:February 2014
pp. 1
Ensemble learning has become a common tool for data stream classification, being able to handle large volumes of stream data and concept drifting. Previous studies focus on building accurate ensemble models from stream data. However, a linear scan of a lar...
 
Discovering significant OPSM subspace clusters in massive gene expression data
Found in: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '06)
By Byron J. Gao, Martin Ester, Obi L. Griffith, Steven J. M. Jones
Issue Date:August 2006
pp. 922-928
Order-preserving submatrixes (OPSMs) have been accepted as a biologically meaningful subspace cluster model, capturing the general tendency of gene expressions across a subset of conditions. In an OPSM, the expression levels of all genes induce the same li...
     
Cager: a framework for cross-page search
Found in: Proceedings of the 21st ACM international conference on Information and knowledge management (CIKM '12)
By Byron J. Gao, Qi Kang, Zhumin Chen
Issue Date:October 2012
pp. 2704-2706
Existing search engines have page as the unit of information of retrieval. They typically return a ranked list of pages, each being a search result containing the query keywords. This within-one-page constraint disallows utilization of relationship informa...
     
Information-complete and redundancy-free keyword search over large data graphs
Found in: Proceedings of the 21st ACM international conference on Information and knowledge management (CIKM '12)
By Byron J. Gao, Qi Kang, Zhumin Chen
Issue Date:October 2012
pp. 2639-2642
Keyword search over graphs has a wide array of applications in querying structured, semi-structured and unstructured data. Existing models typically use minimal trees or bounded subgraphs as query answers. While such models emphasize relevancy, they would ...
     
Polygene-based evolution: a novel framework for evolutionary algorithms
Found in: Proceedings of the 21st ACM international conference on Information and knowledge management (CIKM '12)
By Byron J. Gao, Guibao Cao, Shuaiqiang Wang, Shuangling Wang, Yilong Yin
Issue Date:October 2012
pp. 2263-2266
In this paper, we introduce polygene-based evolution, a novel framework for evolutionary algorithms (EAs) that features distinctive operations in the evolution process. In traditional EAs, the primitive evolution unit is gene, where genes are independent c...
     
Learning to rank for hybrid recommendation
Found in: Proceedings of the 21st ACM international conference on Information and knowledge management (CIKM '12)
By Byron J. Gao, Jiankai Sun, Jun Ma, Shuaiqiang Wang
Issue Date:October 2012
pp. 2239-2242
Most existing recommender systems can be classified into two categories: collaborative filtering and content-based filtering. Hybrid recommender systems combine the advantages of the two for improved recommendation performance. Traditional recommender syst...
     
Adapting vector space model to ranking-based collaborative filtering
Found in: Proceedings of the 21st ACM international conference on Information and knowledge management (CIKM '12)
By Byron J. Gao, Jiankai Sun, Jun Ma, Shuaiqiang Wang
Issue Date:October 2012
pp. 1487-1491
Collaborative filtering (CF) is an effective technique addressing the information overload problem. Recently ranking-based CF methods have shown advantages in recommendation accuracy, being able to capture the preference similarity between users even if th...
     
A framework for personalized and collaborative clustering of search results
Found in: Proceedings of the 20th ACM international conference on Information and knowledge management (CIKM '11)
By Byron J. Gao, David Buttler, David C. Anastasiu
Issue Date:October 2011
pp. 573-582
How to organize and present search results plays a critical role in the utility of search engines. Due to the unprecedented scale of the Web and diversity of search results, the common strategy of ranked lists has become increasingly inadequate, and cluste...
     
ClusteringWiki: personalized and collaborative clustering of search results
Found in: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information (SIGIR '11)
By Byron J. Gao, David Buttler, Dragos C. Anastasiu
Issue Date:July 2011
pp. 1263-1264
How to organize and present search results plays a critical role in the utility of search engines. Due to the unprecedented scale of the Web and diversity of search results, the common strategy of ranked lists has become increasingly inadequate, and cluste...
     
Parallel learning to rank for information retrieval
Found in: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information (SIGIR '11)
By Byron J. Gao, Hady W. Lauw, Ke Wang, Shuaiqiang Wang
Issue Date:July 2011
pp. 1083-1084
Learning to rank represents a category of effective ranking methods for information retrieval. While the primary concern of existing research has been accuracy, learning efficiency is becoming an important issue due to the unprecedented availability of lar...
     
The gardener's problem for web information monitoring
Found in: Proceeding of the 18th ACM conference on Information and knowledge management (CIKM '09)
By Byron J. Gao, David C. Anastasiu, Mingji Xia, Walter Cai
Issue Date:November 2009
pp. 1525-1528
We introduce and theoretically study the Gardener's problem that well models many web information monitoring scenarios, where numerous dynamically changing web sources are monitored and local information needs to be periodically updated under communication...
     
Optimizing complex extraction programs over evolving text data
Found in: Proceedings of the 35th SIGMOD international conference on Management of data (SIGMOD '09)
By AnHai Doan, Byron J. Gao, Fei Chen, Jun Yang, Raghu Ramakrishnan
Issue Date:June 2009
pp. 3-4
Most information extraction (IE) approaches have considered only static text corpora, over which we apply IE only once. Many real-world text corpora however are dynamic. They evolve over time, and so to keep extracted information up to date we often must a...
     
The minimum consistent subset cover problem and its applications in data mining
Found in: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '07)
By Byron J. Gao, Hui Xiong, Jin-Yi Cai, Martin Ester, Oliver Schulte
Issue Date:August 2007
pp. 310-319
In this paper, we introduce and study the Minimum Consistent Subset Cover (MCSC) problem. Given a finite ground set X and a constraint t, find the minimum number of consistent subsets that cover X, where a subset of X is consistent if it satisfies t. The M...
     