Sixth International Conference on Data Mining (ICDM'06)
Hong Kong
Dec. 18, 2006 to Dec. 22, 2006
ISSN: 1550-4786
ISBN: 0-7695-2701-9
TABLE OF CONTENTS
Introduction

Invited speakers (PDF)

pp. xxvii

Steering Committee (PDF)

pp. xviii

Program Committee (PDF)

pp. xix-xxii

Non-PC Reviewers (PDF)

pp. xxiii-xxv

Tutorials (PDF)

pp. xxviii
Invited Papers

Neuroscience: New Insights for AI? (Abstract)

Tomaso Poggio , Massachusetts Institute of Technology, USA
pp. 3-5
Regular Papers

An Information Theoretic Approach to Detection of Minority Subsets in Database (Abstract)

Shin Ando , Yokohama National University, Japan
Einoshin Suzuki , Kyushu University, Japan
pp. 11-20

Learning to Use a Learned Model: A Two-Stage Approach to Classification (Abstract)

Maria-Luiza Antonie , University of Alberta, Canada
Osmar R. Zaiane , University of Alberta, Canada
Robert C. Holte , University of Alberta, Canada
pp. 33-42

Hierarchical Classification by Expected Utility Maximization (Abstract)

Andreas Nurnberger , University of Magdeburg, Germany
Eyke Hullermeier , University of Magdeburg, Germany
Korinna Bade , University of Magdeburg, Germany
pp. 43-52

COALA: A Novel Approach for the Extraction of an Alternate Clustering of High Quality and High Dissimilarity (Abstract)

James Bailey , University of Melbourne, Australia
Eric Bae , University of Melbourne, Australia
pp. 53-62

Cluster Ranking with an Application to Mining Mailbox Networks (Abstract)

Vladimir Soroka , IBM Research Lab, Israel
Ziv Bar-Yossef , Technion and Google Inc., Israel
Ronny Lempel , IBM Research Lab, Israel
Ido Guy , Technion and IBM Research Lab, Israel
Yoelle S. Maarek , Google Inc., Israel
pp. 63-74

Large Scale Detection of Irregularities in Accounting Data (Abstract)

Krishna Kumaraswamy , PricewaterhouseCoopers LLP, USA
Markus G. Anderle , PricewaterhouseCoopers LLP, USA
David M. Steier , PricewaterhouseCoopers LLP, USA
Rohit Kumar , PricewaterhouseCoopers LLP, USA
Stephen Bay , PricewaterhouseCoopers LLP, USA
pp. 75-86

Adaptive Blocking: Learning to Scale Up Record Linkage (Abstract)

Beena Kamath , Google Inc., USA
Raymond J. Mooney , University of Texas at Austin, USA
Mikhail Bilenko , Microsoft Research, USA
pp. 87-96

Adaptive Parallel Graph Mining for CMP Architectures (Abstract)

Gregory Buehrer , The Ohio State University, USA
Srinivasan Parthasarathy , The Ohio State University, USA
Yen-Kuang Chen , Intel Corporation, USA
pp. 97-106

Meta Clustering (Abstract)

Casey Smith , Cornell University, USA
Rich Caruana , Cornell University, USA
Nam Nguyen , Cornell University, USA
Mohamed Elhawary , Cornell University, USA
pp. 107-118

Mixed-Drove Spatio-Temporal Co-occurrence Pattern Mining: A Summary of Results (Abstract)

Shashi Shekhar , University of Minnesota, USA
James A. Shine , U.S. Army ERDC, USA
Mete Celik , University of Minnesota, USA
James P. Rogers , U.S. Army ERDC, USA
Jin Soung Yoo , University of Minnesota, USA
pp. 119-128

An Interactive Semantic Video Mining and Retrieval Platform--Application in Transportation Surveillance Video for Incident Detection (Abstract)

Xin Chen , University of Alabama at Birmingham, USA
Chengcui Zhang , University of Alabama at Birmingham, USA
pp. 129-138

Tolerance Closed Frequent Itemsets (Abstract)

Yiping Ke , The Hong Kong University of Science and Technology, Hong Kong
James Cheng , The Hong Kong University of Science and Technology, Hong Kong
Wilfred Ng , The Hong Kong University of Science and Technology, Hong Kong
pp. 139-148

Active Learning to Maximize Area Under the ROC Curve (Abstract)

Stephen Scott , University of Nebraska, USA
Deng Kun , University of Nebraska, USA
Matt Culver , University of Nebraska, USA
pp. 149-158

Rapid Identification of Column Heterogeneity (Abstract)

Bing Tian Dai , National University of Singapore, Singapore
Beng Chin Ooi , National University of Singapore, Singapore
Suresh Venkatasubramanian , AT&T Labs-Research
Nick Koudas , University of Toronto
Divesh Srivastava , AT&T Labs-Research
pp. 159-170

Data Mining Approaches to Criminal Career Analysis (Abstract)

Tim K. Cocx , Leiden University, The Netherlands
Walter A. Kosters , Leiden University, The Netherlands
Jeroen F. J. Laros , Leiden University, The Netherlands
Jeroen S. de Bruin , Leiden University, The Netherlands
Joost N. Kok , Leiden University, The Netherlands
pp. 171-177

Biclustering Protein Complex Interactions with a Biclique Finding Algorithm (Abstract)

Chris Ding , Lawrence Berkeley National Laboratory, USA
Stephen R. Holbrook , Lawrence Berkeley National Laboratory, USA
Ya Zhang , University of Kansas, USA
Tao Li , Florida International University, USA
pp. 178-187

Turning Clusters into Patterns: Rectangle-Based Discriminative Data Description (Abstract)

Martin Ester , Simon Fraser University, Canada
Byron J. Gao , Simon Fraser University, Canada
pp. 200-211

Converting Output Scores from Outlier Detection Algorithms into Probability Estimates (Abstract)

Jing Gao , Michigan State University, USA
Pang-Ning Tan , Michigan State University, USA
pp. 212-221

Personalization in Context: Does Context Matter When Building Personalized Customer Models? (Abstract)

A. Tuzhilin , New York University, USA
C. Palmisano , Politecnico di Bari, Italy
M. Gorgoglione , Politecnico di Bari, Italy
pp. 222-231

Bregman Bubble Clustering: A Robust, Scalable Framework for Locating Multiple, Dense Regions in Data (Abstract)

Gunjan Gupta , University of Texas at Austin, USA
Joydeep Ghosh , University of Texas at Austin, USA
pp. 232-243

Optimal Segmentation Using Tree Models (Abstract)

Aristides Gionis , University of Helsinki, Finland
Robert Gwadera , University of Helsinki, Finland
Heikki Mannila , University of Helsinki, Finland
pp. 244-253

Mining for Tree-Query Associations in a Graph (Abstract)

Eveline Hoekx , Hasselt University and Transnational University of Limburg, Belgium
Jan Van den Bussche , Hasselt University and Transnational University of Limburg, Belgium
pp. 254-264

Keyphrase Extraction Using Semantic Networks Structure Analysis (Abstract)

Zhi Zhou , Chinese Academy of Sciences, China
Tiejun Huang , Chinese Academy of Sciences, China
Yonghong Tian , Chinese Academy of Sciences, China
Chong Huang , Chinese Academy of Sciences, China
Charles X. Ling , University of Western Ontario, Canada
pp. 275-284

Subjectivity Categorization of Weblog with Part-of-Speech Based Smoothing (Abstract)

Xuanhui Wang , University of Illinois at Urbana-Champaign, USA
Shen Huang , Microsoft Research Asia, China
Zheng Chen , Microsoft Research Asia, China
Hua-Jun Zeng , Microsoft Research Asia, China
Jian-Tao Sun , Microsoft Research Asia, China
pp. 285-294

Applying Data Mining to Pseudo-Relevance Feedback for High Performance Text Retrieval (Abstract)

Yan Rui Huang , York University, Canada
Yang Liu , York University, Canada
Xiangji Huang , York University, Canada
Josiah Poon , University of Sydney, Australia
Aijun An , York University, Canada
Miao Wen , York University, Canada
pp. 295-306

Improving Personalization Solutions through Optimal Segmentation of Customer Bases (Abstract)

Alexander Tuzhilin , New York University, USA
Tianyi Jiang , New York University, USA
pp. 307-318

Secure Distributed k-Anonymous Pattern Mining (Abstract)

Maurizio Atzori , University of Pisa and ISTI-CNR, Italy
Wei Jiang , Purdue University, USA
pp. 319-329

Dimension Reduction for Supervised Ordering (Abstract)

Toshihiro Kamishima , National Institute of Advanced Industrial Science and Technology (AIST), Japan
Shotaro Akaho , National Institute of Advanced Industrial Science and Technology (AIST), Japan
pp. 330-339

Incremental Mining of Frequent Query Patterns from XML Queries for Caching (Abstract)

Jianhua Feng , Tsinghua University, China
Yong Zhang , Tsinghua University, China
Lizhu Zhou , Tsinghua University, China
Jianyong Wang , Tsinghua University, China
Guoliang Li , Tsinghua University, China
pp. 350-361

The Relationships Among Various Nonnegative Matrix Factorization Methods for Clustering (Abstract)

Chris Ding , University of California, Berkeley, USA
Tao Li , Florida International University, USA
pp. 362-371

Integrating Features from Different Sources for Music Information Retrieval (Abstract)

Tao Li , Florida International University, USA
Shenghuo Zhu , NEC Laboratories America, USA
Mitsunori Ogihara , University of Rochester, USA
pp. 372-381

How Bayesians Debug (Abstract)

Zeng Lian , Brigham Young University, USA
Jiawei Han , University of Illinois at Urbana-Champaign, USA
Chao Liu , University of Illinois at Urbana-Champaign, USA
pp. 382-393

On the Use of Structure and Sequence-Based Features for Protein Classification and Retrieval (Abstract)

Srinivasan Parthasarathy , The Ohio State University, USA
Keith Marsolo , The Ohio State University, USA
pp. 394-403

P3C: A Robust Projected Clustering Algorithm (Abstract)

Martin Ester , Simon Fraser University, Canada
Jorg Sander , University of Alberta, Canada
Gabriela Moise , University of Alberta, Canada
pp. 414-425

Frequent Closed Itemset Mining Using Prefix Graphs with an Efficient Flow-Based Pruning Strategy (Abstract)

H.D.K. Moonesinghe , Michigan State University, USA
Pang-Ning Tan , Michigan State University, USA
Samah Fodeh , Michigan State University, USA
pp. 426-435

Efficient Clustering of Uncertain Data (Abstract)

Chun Kit Chui , The University of Hong Kong, Hong Kong
Wang Kay Ngai , The University of Hong Kong, Hong Kong
Kevin Y. Yip , Yale University, USA
Michael Chau , The University of Hong Kong, Hong Kong
Reynold Cheng , Hong Kong Polytechnic University, Hong Kong
Ben Kao , The University of Hong Kong, Hong Kong
pp. 436-445

A Data Mining Approach for Capacity Building of Stakeholders in Integrated Flood Management (Abstract)

Natasa Manojlovic , Technische Universitaet Hamburg-Harburg (TUHH), Germany
Friedrich Mayer-Lindenberg , Technische Universitaet Hamburg-Harburg (TUHH), Germany
Peter Owotoki , Technische Universitaet Hamburg-Harburg (TUHH), Germany
Erik Pasche , Technische Universitaet Hamburg-Harburg (TUHH), Germany
pp. 446-455

Local Correlation Tracking in Time Series (Abstract)

Spiros Papadimitriou , IBM T.J. Watson Research Center, USA
Jimeng Sun , Carnegie Mellon University, USA
Philip S. Yu , IBM T.J. Watson Research Center, USA
pp. 456-465

Who Thinks Who Knows Who? Socio-cognitive Analysis of Email Networks (Abstract)

Jaideep Srivastava , University of Minnesota, USA
Sandeep Mane , University of Minnesota, USA
Nishith Pathak , University of Minnesota, USA
pp. 466-477

An Efficient Reference-Based Approach to Outlier Detection in Large Datasets (Abstract)

Yong Gao , University of British Columbia Okanagan, Canada
Yaling Pei , University of Alberta, Canada
Osmar R. Zaiane , University of Alberta, Canada
pp. 478-487

Using an Ensemble of One-Class SVM Classifiers to Harden Payload-based Anomaly Detection Systems (Abstract)

Roberto Perdisci , University of Cagliari, Italy; Georgia Institute of Technology, USA
Wenke Lee , Georgia Institute of Technology, USA
Guofei Gu , Georgia Institute of Technology, USA
pp. 488-498

Relational Ensemble Classification (Abstract)

Christine Preisach , University of Freiburg, Germany
Lars Schmidt-Thieme , University of Freiburg, Germany
pp. 499-509

Discovering Partial Orders in Binary Data (Abstract)

Philip S. Yu , IBM T.J. Watson Research Center, USA
Deepak Rajan , IBM T.J. Watson Research Center, USA
pp. 510-521

Stability Region Based Expectation Maximization for Model-based Clustering (Abstract)

Hsiao-Dong Chiang , Cornell University, USA
Chandan K. Reddy , Cornell University, USA
Bala Rajaratnam , Cornell University, USA
pp. 522-531

Co-clustering Documents and Words Using Bipartite Isoperimetric Graph Partitioning (Abstract)

Manjeet Rege , Wayne State University, USA
Ming Dong , Wayne State University, USA
Farshad Fotouhi , Wayne State University, USA
pp. 532-541

Latent Dirichlet Co-Clustering (Abstract)

Evangelos E. Milios , Dalhousie University, Canada
M. Mahdi Shafiei , Dalhousie University, Canada
pp. 542-551

Latent Friend Mining from Blog Data (Abstract)

Zheng Chen , Microsoft Research Asia, China
Dou Shen , Hong Kong University of Science and Technology, Hong Kong
Jian-Tao Sun , Microsoft Research Asia, China
Qiang Yang , Hong Kong University of Science and Technology, Hong Kong
pp. 552-561

The PDD Framework for Detecting Categories of Peculiar Data (Abstract)

Ken Konkel , University of Regina, Canada
Yiyu Yao , University of Regina, Canada
Mahesh Shrestha , University of Regina, Canada
Liqiang Geng , University of Regina, Canada
Howard J. Hamilton , University of Regina, Canada
pp. 562-571

Entity Resolution with Markov Logic (Abstract)

Parag Singla , University of Washington, Seattle, USA
Pedro Domingos , University of Washington, Seattle, USA
pp. 572-582

Boosting Kernel Models for Regression (Abstract)

Xin Yao , University of Birmingham, UK
Ping Sun , University of Birmingham, UK
pp. 583-591

Boosting for Learning Multiple Classes with Imbalanced Class Distribution (Abstract)

Mohamed S. Kamel , University of Waterloo, Canada
Yang Wang , Pattern Discovery Software Systems Ltd., Canada
Yanmin Sun , University of Waterloo, Canada
pp. 592-602

What is the Dimension of Your Binary Data? (Abstract)

Heikki Mannila , University of Helsinki and Helsinki University of Technology, Finland
Aristides Gionis , University of Helsinki and Helsinki University of Technology, Finland
Taneli Mielikainen , University of Helsinki and Helsinki University of Technology, Finland
Nikolaj Tatti , University of Helsinki and Helsinki University of Technology, Finland
pp. 603-612

Fast Random Walk with Restart and Its Applications (Abstract)

Hanghang Tong , Carnegie Mellon University, USA
Jia-Yu Pan , Carnegie Mellon University, USA
Christos Faloutsos , Carnegie Mellon University, USA
pp. 613-622

Anytime Classification Using the Nearest Neighbor Algorithm with Applications to Stream Mining (Abstract)

Xiaopeng Xi , University of California, Riverside, USA
Dah-Jye Lee , Brigham Young University, USA
Eamonn Keogh , University of California, Riverside, USA
Ken Ueno , Toshiba Corporation, Japan
pp. 623-632

Lazy Associative Classification (Abstract)

Adriano Veloso , Federal University of Minas Gerais, Brazil
Mohammed J. Zaki , Rensselaer Polytechnic Institute, USA
Wagner Meira Jr. , Federal University of Minas Gerais, Brazil
pp. 645-654

Geometrically Inspired Itemset Mining (Abstract)

Sanjay Chawla , University of Sydney, Australia
Florian Verhein , University of Sydney, Australia
pp. 655-666

Finding "Who Is Talking to Whom" in VoIP Networks via Progressive Stream Clustering (Abstract)

Philip S. Yu , IBM T.J. Watson Research Center
Eric Bouillet , IBM T.J. Watson Research Center
Aris Anagnostopoulos , Yahoo! Research
Michail Vlachos , IBM T.J. Watson Research Center
Olivier Verscheure , IBM T.J. Watson Research Center
pp. 667-677

Comparison of Descriptor Spaces for Chemical Compound Retrieval and Classification (Abstract)

Nikil Wale , University of Minnesota, USA
George Karypis , University of Minnesota, USA
pp. 678-689

Regularized Least Absolute Deviations Regression and an Efficient Algorithm for Parameter Tuning (Abstract)

Michael D. Gordon , University of Michigan, USA
Ji Zhu , University of Michigan, USA
Li Wang , University of Michigan, USA
pp. 690-700

LOCI: Load Shedding through Class-Preserving Data Acquisition (Abstract)

Baile Shi , Fudan University, China
Wei Wang , Fudan University, China
Philip S. Yu , IBM T.J. Watson Research Center, USA
Peng Wang , Fudan University, China
Haixun Wang , IBM T.J. Watson Research Center, USA
pp. 701-710

SAXually Explicit Images: Finding Unusual Shapes (Abstract)

Xiaopeng Xi , University of California, Riverside, USA
Li Wei , University of California, Riverside, USA
Eamonn Keogh , University of California, Riverside, USA
pp. 711-720

A Novel Scalable Algorithm for Supervised Subspace Learning (Abstract)

Benyu Zhang , Microsoft Research Asia, China
Ning Liu , Microsoft Research Asia, China
Jun Yan , Microsoft Research Asia, China
Zheng Chen , Microsoft Research Asia, China
Shuicheng Yan , University of Illinois at Urbana-Champaign, USA
Qiang Yang , Hong Kong University of Science and Technology, Hong Kong
pp. 721-730

A Novel Method for Detecting Outlying Subspaces in High-dimensional Databases Using Genetic Algorithm (Abstract)

Qigang Gao , Dalhousie University, Canada
Hai Wang , Saint Mary's University, Canada
Ji Zhang , Dalhousie University, Canada
pp. 731-740

Discovering Unrevealed Properties of Probability Estimation Trees: On Algorithm Selection and Performance Explanation (Abstract)

Xiaojing Yuan , University of Houston, USA
Wei Fan , IBM T.J. Watson Research, USA
Zujia Xu , Dillard University
Bill Buckles , Tulane University, USA
Kun Zhang , Tulane University, USA
pp. 741-752

Forecasting Skewed Biased Stochastic Ozone Days: Analyses and Solutions (Abstract)

Ian Davidson , University at Albany, USA
Wei Fan , IBM T.J. Watson Research, USA
Kun Zhang , Tulane University, USA
Xiaojing Yuan , University of Houston, USA
Xiangshang Li , Tulane University, USA
pp. 753-764

Identifying Follow-Correlation Itemset-Pairs (Abstract)

Zifang Huang , Beihang University, China
Xiaofeng Zhu , Guangxi Normal University, China
Shichao Zhang , Guangxi Normal University, China; Beihang University, China
Jilian Zhang , Guangxi Normal University, China
pp. 765-774

On the Lower Bound of Local Optimums in K-Means Algorithm (Abstract)

Anthony K.H. Tung , National University of Singapore, Singapore
Zhenjie Zhang , National University of Singapore, Singapore
Bing Tian Dai , National University of Singapore, Singapore
pp. 775-786
Short Papers

Fast On-line Kernel Learning for Trees (Abstract)

Giovanni Da San Martino , Universita di Padova, Italy
Fabio Aiolli , Universita di Padova, Italy
Alessandro Moschitti , Universita di Roma "Tor Vergata", Italy
Alessandro Sperduti , Universita di Padova, Italy
pp. 787-791

bitSPADE: A Lattice-based Sequential Pattern Mining Algorithm Using Bitmap Representation (Abstract)

Emmanuel Viennet , Institut Galilee Universite Paris, France
Sujeevan Aseervatham , Institut Galilee Universite Paris, France
Aomar Osmani , Institut Galilee Universite Paris, France
pp. 792-797

Decision Trees for Functional Variables (Abstract)

David Madigan , Rutgers University, USA
Suhrid Balakrishnan , Rutgers University, USA
pp. 798-802

Mining Latent Associations of Objects Using a Typed Mixture Model--A Case Study on Expert/Expertise Mining (Abstract)

Shenghua Bao , Shanghai Jiao Tong University, China
Bing Liu , University of Illinois at Chicago, USA
Hang Li , Microsoft Research Asia, China
Yunbo Cao , Microsoft Research Asia, China
Yong Yu , Shanghai Jiao Tong University, China
pp. 803-807

Semantic Kernels for Text Classification Based on Topological Measures of Feature Similarity (Abstract)

Roberto Basili , University of Rome 'Tor Vergata', Italy
Marco Cammisa , University of Rome 'Tor Vergata', Italy
Alessandro Moschitti , University of Rome 'Tor Vergata', Italy
Stephan Bloehdorn , University of Karlsruhe, Germany
pp. 808-812

Mining Maximal Generalized Frequent Geographic Patterns with Knowledge Constraints (Abstract)

Sandro Camargo , Universidade Federal do Rio Grande do Sul (UFRGS), Brazil
Bart Kuijpers , Hasselt University and Transnational University of Limburg, Belgium
Paulo Engel , Universidade Federal do Rio Grande do Sul (UFRGS), Brazil
Joao Valiati , Universidade Federal do Rio Grande do Sul (UFRGS), Brazil
Luis O. Alvares , Universidade Federal do Rio Grande do Sul (UFRGS), Brazil
Vania Bogorny , Universidade Federal do Rio Grande do Sul (UFRGS), Brazil
pp. 813-817

Pattern Mining in Frequent Dynamic Subgraphs (Abstract)

Hans-Peter Kriegel , Ludwig-Maximilians-Universitat, Germany
Karsten M. Borgwardt , Ludwig-Maximilians-Universitat, Germany
Peter Wackersreuther , Ludwig-Maximilians-Universitat, Germany
pp. 818-822

Discovery of Collocation Episodes in Spatiotemporal Data (Abstract)

Nikos Mamoulis , The University of Hong Kong, Hong Kong
David W. Cheung , The University of Hong Kong, Hong Kong
Huiping Cao , The University of Hong Kong, Hong Kong
pp. 823-827

Getting the Most Out of Ensemble Selection (Abstract)

Rich Caruana , Cornell University, USA
Alexandru Niculescu-Mizil , Cornell University, USA
Art Munson , Cornell University, USA
pp. 828-833

Diverse Topic Phrase Extraction through Latent Semantic Analysis (Abstract)

Benyu Zhang , Microsoft Research Asia, China
Qiang Yang , Hong Kong University of Science and Technology, Hong Kong
Jun Yan , Microsoft Research Asia, China
Zheng Chen , Microsoft Research Asia, China
Jilin Chen , University of Minnesota, USA
pp. 834-838

AC-Close: Efficiently Mining Approximate Closed Itemsets by Core Pattern Recovery (Abstract)

Philip S. Yu , IBM T.J. Watson Research Center, USA
Hong Cheng , University of Illinois at Urbana-Champaign, USA
Jiawei Han , University of Illinois at Urbana-Champaign, USA
pp. 839-844

Belief Propagation in Large, Highly Connected Graphs for 3D Part-Based Object Recognition (Abstract)

Jude Shavlik , University of Wisconsin-Madison, USA
Frank DiMaio , University of Wisconsin-Madison, USA
pp. 845-850

A Framework for Regional Association Rule Mining in Spatial Datasets (Abstract)

Wei Ding , University of Houston, USA
Xiaojing Yuan , University of Houston, USA
Christoph F. Eick , University of Houston, USA
Jing Wang , University of Houston, USA
pp. 851-856

Mining Generalized Graph Patterns Based on User Examples (Abstract)

Pavel Dmitriev , Cornell University, USA
Carl Lagoze , Cornell University, USA
pp. 857-862

An Experimental Investigation of Graph Kernels on a Collaborative Recommendation Task (Abstract)

Luh Yen , Universite catholique de Louvain, Belgium
Alain Pirotte , Universite catholique de Louvain, Belgium
Francois Fouss , Universite catholique de Louvain, Belgium
Marco Saerens , Universite catholique de Louvain, Belgium
pp. 863-868

A Balanced Ensemble Approach to Weighting Classifiers for Text Classification (Abstract)

Huan Liu , Arizona State University, USA
Haixun Wang , IBM T.J. Watson Research Center, USA
Gabriel Pui Cheong Fung , The Chinese University of Hong Kong, Hong Kong
David W. Cheung , The University of Hong Kong, Hong Kong
Jeffrey Xu Yu , The Chinese University of Hong Kong, Hong Kong
pp. 869-873

Detection of Interdomain Routing Anomalies Based on Higher-Order Path Analysis (Abstract)

William M. Pottenger , Rutgers University, USA
Murat Can Ganiz , Lehigh University, USA
Mooi Choo Chuah , Lehigh University, USA
Sudhan Kanitkar , Lehigh University, USA
pp. 874-879

Star-Structured High-Order Heterogeneous Data Co-clustering Based on Consistent Information Theory (Abstract)

Tie-Yan Liu , Microsoft Research Asia, China
Bin Gao , Microsoft Research Asia, China
Wei-Ying Ma , Microsoft Research Asia, China
pp. 880-884

GraphRank: Statistical Modeling and Mining of Significant Subgraphs in the Feature Space (Abstract)

Huahai He , University of California, Santa Barbara, USA
Ambuj K. Singh , University of California, Santa Barbara, USA
pp. 885-890

A Feature Selection and Evaluation Scheme for Computer Virus Detection (Abstract)

Nathalie Japkowicz , University of Ottawa, Canada
Olivier Henchiri , University of Ottawa, Canada
pp. 891-895

Constructing Ensembles for Better Ranking (Abstract)

Jin Huang , University of Ottawa, Canada
Charles X. Ling , The University of Western Ontario, Canada
pp. 902-906

TRIAS--An Algorithm for Mining Iceberg Tri-Lattices (Abstract)

Gerd Stumme , University of Kassel, Germany; Research Center L3S, Germany
Christoph Schmitz , University of Kassel, Germany
Andreas Hotho , University of Kassel, Germany
Bernhard Ganter , Dresden University of Technology, Germany
Robert Jaschke , University of Kassel, Germany; Research Center L3S, Germany
pp. 907-911

Intelligent Icons: Integrating Lite-Weight Data Mining and Visualization into GUI Operating Systems (Abstract)

Xiaopeng Xi , University of California, Riverside, USA
Jin Shieh , University of California, Riverside, USA
Li Wei , University of California, Riverside, USA
Eamonn Keogh , University of California, Riverside, USA
Stefano Lonardi , University of California, Riverside, USA
Scott Sirowy , University of California, Riverside, USA
pp. 912-916

COSMIC: Conceptually Specified Multi-Instance Clusters (Abstract)

Arthur Zimek , Ludwig-Maximilians-Universitat, Germany
Hans-Peter Kriegel , Ludwig-Maximilians-Universitat, Germany
Matthias Schubert , Ludwig-Maximilians-Universitat, Germany
Alexey Pryakhin , Ludwig-Maximilians-Universitat, Germany
pp. 917-921

Direct Marketing When There Are Voluntary Buyers (Abstract)

Daymond Ling , Canadian Imperial Bank of Commerce
Ke Wang , Simon Fraser University, Canada
Yi-Ting Lai , Simon Fraser University, Canada
Hua Shi , Canadian Imperial Bank of Commerce
Jason Zhang , Canadian Imperial Bank of Commerce
pp. 922-927

DSTree: A Tree Structure for the Mining of Frequent Sets from Data Streams (Abstract)

Carson Kai-Sang Leung , The University of Manitoba, Canada
Quamrul I. Khan , The University of Manitoba, Canada
pp. 928-932

Searching for Pattern Rules (Abstract)

Guichong Li , University of Regina, Canada
Howard J. Hamilton , University of Regina, Canada
pp. 933-937

Adding Semantics to Email Clustering (Abstract)

Hua Li , Microsoft Research Asia, China
Qiang Yang , Hong Kong University of Science and Technology, Hong Kong
Zheng Chen , Microsoft Research Asia, China
Benyu Zhang , Microsoft Research Asia, China
Dou Shen , Hong Kong University of Science and Technology, Hong Kong
pp. 938-942

Gradual Cube: Customize Profile on Mobile OLAP (Abstract)

Haofeng Zhou , Fudan University, China
Jun Li , Fudan University, China
Wei Wang , Fudan University, China
pp. 943-947

CoMiner: An Effective Algorithm for Mining Competitors from the Web (Abstract)

Rui Li , Shanghai Jiao Tong University, China
Yunbo Cao , Microsoft Research Asia, China
Jin Wang , Shanghai Jiao Tong University, China
Yong Yu , Shanghai Jiao Tong University, China
Shenghua Bao , Shanghai Jiao Tong University, China
pp. 948-952

Multi-Tier Granule Mining for Representations of Multidimensional Association Rules (Abstract)

Yue Xu , Queensland University of Technology, Australia
Yuefeng Li , Queensland University of Technology, Australia
Wanzhong Yang , Queensland University of Technology, Australia
pp. 953-958

Social Capital in Friendship-Event Networks (Abstract)

Louis Licamele , University of Maryland, USA
Lise Getoor , University of Maryland, USA
pp. 959-964

Exploratory Under-Sampling for Class-Imbalance Learning (Abstract)

Xu-Ying Liu , Nanjing University, China
Jianxin Wu , Georgia Institute of Technology, USA
Zhi-Hua Zhou , Nanjing University, China
pp. 965-969

The Influence of Class Imbalance on Cost-Sensitive Learning: An Empirical Study (Abstract)

Xu-Ying Liu , Nanjing University, China
Zhi-Hua Zhou , Nanjing University, China
pp. 970-974

Similarity of Temporal Query Logs Based on ARIMA Model (Abstract)

Ying Li , Microsoft AdCenter, USA
Ning Liu , Microsoft Research Asia, China
Shuzhen Nong , Microsoft AdCenter, USA
Benyu Zhang , Microsoft Research Asia, China
Zheng Chen , Microsoft Research Asia, China
Jun Yan , Microsoft Research Asia, China
pp. 975-979

Probabilistic Segmentation and Analysis of Horizontal Cells (Abstract)

Ambuj K. Singh , University of California, Santa Barbara, USA
Vebjorn Ljosa , University of California, Santa Barbara, USA
pp. 980-985

Mining Correlation between Motifs and Gene Expression (Abstract)

Adrian E. Platts , Wayne State University
Stephen A. Krawetz , Wayne State University
Yi Lu , Wayne State University
Shiyong Lu , Wayne State University
pp. 986-990

On Trajectory Representation for Scientific Features (Abstract)

Sameep Mehta , India Research Labs, IBM, India
Srinivasan Parthasarathy , Ohio State University, USA
Raghu Machiraju , Ohio State University, USA
pp. 997-1001

NewsCATS: A News Categorization and Trading System (Abstract)

Marc-Andre Mittermayer , Swiss Capital Group, Switzerland
Gerhard F. Knolmayer , University of Bern, Switzerland
pp. 1002-1007

Improving Grouped-Entity Resolution Using Quasi-Cliques (Abstract)

Jian Pei , Simon Fraser University, Canada
Ergin Elmacioglu , The Pennsylvania State University, USA
Jaewoo Kang , North Carolina State University, USA; Korea University, Korea
Dongwon Lee , The Pennsylvania State University, USA
Byung-Won On , The Pennsylvania State University, USA
pp. 1008-1015

Fast Relevance Discovery in Time Series (Abstract)

Haixun Wang , IBM Research
Sheng Ma , movivi.com
Chang-shing Perng , IBM Research
pp. 1016-1020

Probabilistic Enhanced Mapping with the Generative Tabular Model (Abstract)

Mohamed Nadif , Universite Paris Descartes, France
Rodolphe Priam , Universite de Poitiers, France
pp. 1021-1025

Object Identification with Constraints (Abstract)

Steffen Rendle , University of Freiburg, Germany
Lars Schmidt-Thieme , University of Freiburg, Germany
pp. 1026-1031

High-Performance Unsupervised Relation Extraction from Large Corpora (Abstract)

Binjamin Rozenfeld , Bar-Ilan University, Ramat Gan, Israel
Ronen Feldman , Bar-Ilan University, Ramat Gan, Israel
pp. 1032-1037

Cluster Based Core Vector Machine (Abstract)

M. Narasimha Murty , Indian Institute of Science, India
Asharaf S , Indian Institute of Science, India
S.K. Shevade , Indian Institute of Science, India
pp. 1038-1042

Enhancing Text Clustering Using Concept-based Mining Model (Abstract)

Shady Shehata , University of Waterloo, Canada
Mohamed Kamel , University of Waterloo, Canada
Fakhri Karray , University of Waterloo, Canada
pp. 1043-1048

Detecting Link Spam Using Temporal Information (Abstract)

Bin Gao , Microsoft Research Asia, China
Guang Feng , Microsoft Research Asia, China; Tsinghua University, China
Hang Li , Microsoft Research Asia, China
Shiji Song , Tsinghua University, China
Guoyang Shen , Microsoft Research Asia, China; Tsinghua University, China
Tie-Yan Liu , Microsoft Research Asia, China
pp. 1049-1053

Minimum Enclosing Spheres Formulations for Support Vector Ordinal Regression (Abstract)

S.K. Shevade , Indian Institute of Science, India
Wei Chu , Columbia University, USA
pp. 1054-1058

Mining Maximal Quasi-Bicliques to Co-Cluster Stocks and Financial Ratios for Value Investment (Abstract)

Kelvin Sim , Institute for Infocomm Research, Singapore
Jinyan Li , Institute for Infocomm Research, Singapore
Vivekanand Gopalkrishnan , Nanyang Technological University, Singapore
Guimei Liu , National University of Singapore, Singapore
pp. 1059-1063

Boosting the Feature Space: Text Classification for Unstructured Data on the Web (Abstract)

Yang Song , The Pennsylvania State University, USA
Hongyuan Zha , The Pennsylvania State University, USA
Ding Zhou , The Pennsylvania State University, USA
Jian Huang , The Pennsylvania State University, USA
C. Lee Giles , The Pennsylvania State University, USA
Isaac G. Councill , The Pennsylvania State University, USA
pp. 1064-1069

Plagiarism Detection in arXiv (Abstract)

Simeon Warner , Cornell University, USA
Johannes Gehrke , Cornell University, USA
Paul Ginsparg , Cornell University, USA
Daria Sorokina , Cornell University, USA
pp. 1070-1075

Window-based Tensor Analysis on High-dimensional and Multi-aspect Streams (Abstract)

Jimeng Sun , Carnegie Mellon University, USA
Spiros Papadimitriou , IBM T.J. Watson Research Center, USA
Philip S. Yu , IBM T.J. Watson Research Center, USA
pp. 1076-1080

Automatic Single-Organ Segmentation in Computed Tomography Images (Abstract)

Daniela Raicu , DePaul University, USA
Ruchaneewan Susomboon , DePaul University, USA
David Channin , Northwestern University, USA
Jacob Furst , DePaul University, USA
pp. 1081-1086

Improving Nearest Neighbor Classifier Using Tabu Search and Ensemble Distance Metrics (Abstract)

Muhammad Atif Tahir , University of the West of England, UK
James Smith , University of the West of England, UK
pp. 1086-1090

Comparisons of K-Anonymization and Randomization Schemes under Linking Attacks (Abstract)

Wenliang Du , Syracuse University, USA
Zhouxuan Teng , Syracuse University, USA
pp. 1091-1096

MARGIN: Maximal Frequent Subgraph Mining (Abstract)

Lini T Thomas , IIIT, India
Kamalakar Karlapalem , IIIT, India
pp. 1097-1101

Resource Management for Networked Classifiers in Distributed Stream Mining Systems (Abstract)

Upendra V. Chaudhari , IBM T.J. Watson Research Center, USA
Lisa D. Amini , IBM T.J. Watson Research Center, USA
Deepak S. Turaga , IBM T.J. Watson Research Center, USA
Olivier Verscheure , IBM T.J. Watson Research Center, USA
pp. 1102-1107

Entropy-based Concept Shift Detection (Abstract)

Peter Vorburger , University of Zurich, Switzerland
Abraham Bernstein , University of Zurich, Switzerland
pp. 1113-1118

Recommendation on Item Graphs (Abstract)

Fei Wang , Tsinghua University, China
Tao Li , Florida International University, USA
Sheng Ma , Vivido Media (Beijing) Inc., China
Liuzhong Yang , Vivido Media (Beijing) Inc., China
pp. 1119-1123

Solution Path for Semi-Supervised Classification with Manifold Regularization (Abstract)

Tao Chen , The Hong Kong University of Science and Technology, China
Frederick H. Lochovsky , The Hong Kong University of Science and Technology, China
Gang Wang , The Hong Kong University of Science and Technology, China
Dit-Yan Yeung , The Hong Kong University of Science and Technology, China
pp. 1124-1129

Semi-Supervised Kernel Regression (Abstract)

Xian-Sheng Hua , Microsoft Research Asia, China
Meng Wang , University of Science and Technology of China, China
Hong-Jiang Zhang , Microsoft Research Asia, China
Yan Song , University of Science and Technology of China, China
Li-Rong Dai , University of Science and Technology of China, China
pp. 1130-1135

Mining Complex Time-Series Data by Learning Markovian Models (Abstract)

Yi Wang , Tsinghua University, China
Lizhu Zhou , Tsinghua University, China
Jianhua Feng , Tsinghua University, China
Zhi-Qiang Liu , City University of Hong Kong, Hong Kong
Jianyong Wang , Tsinghua University, China
pp. 1136-1140

Temporal Data Mining in Dynamic Feature Spaces (Abstract)

Brent Wenerstrom , Sharp Analytics, USA
Christophe Giraud-Carrier , Brigham Young University, USA
pp. 1141-1145

Discover Bayesian Networks from Incomplete Data Using a Hybrid Evolutionary Algorithm (Abstract)

Man Leung Wong , Lingnan University, Hong Kong
Yuan Yuan Guo , Lingnan University, Hong Kong
pp. 1146-1150

Distances and (Indefinite) Kernels for Sets of Objects (Abstract)

Adam Woznica , University of Geneva, Switzerland
Alexandros Kalousis , University of Geneva, Switzerland
Melanie Hilario , University of Geneva, Switzerland
pp. 1151-1156

Deploying Approaches for Pattern Refinement in Text Mining (Abstract)

Yuefeng Li , Queensland University of Technology, Australia
Yue Xu , Queensland University of Technology, Australia
Sheng-Tang Wu , Queensland University of Technology, Australia
pp. 1157-1161

TOP-COP: Mining TOP-K Strongly Correlated Pairs in Large Databases (Abstract)

Mark Brodie , IBM TJ Watson
Sheng Ma , Vivido Media Inc.
Hui Xiong , Rutgers University, USA
pp. 1162-1166

Manifold Clustering of Shapes (Abstract)

Dragomir Yankov , University of California, Riverside, USA
Eamonn Keogh , University of California, Riverside, USA
pp. 1167-1171

Adaptive Kernel Principal Component Analysis with Unsupervised Learning of Kernels (Abstract)

Songcan Chen , NUAA, China
Zhi-Hua Zhou , Nanjing University, China
Daoqiang Zhang , Nanjing University, China
pp. 1178-1182

Rule-Based Platform for Web User Profiling (Abstract)

Manu Shukla , AOL, LLC, USA
Jianping Zhang , AOL, LLC, USA
pp. 1183-1187

Opening the Black Box of Feature Extraction: Incorporating Visualization into High-Dimensional Data Mining Processes (Abstract)

Jianting Zhang , The University of New Mexico, USA
Le Gruenwald , University of Oklahoma, USA; National Science Foundation, USA
pp. 1188-1192

Semantic Smoothing for Model-based Document Clustering (Abstract)

Xiaohua Hu , Drexel University
Xiaodan Zhang , Drexel University
Xiaohua Zhou , Drexel University
pp. 1193-1198

Corrective Classification: Classifier Ensembling with Corrective and Diverse Base Learners (Abstract)

Xindong Wu , University of Vermont, USA
Xingquan Zhu , Florida Atlantic University, USA
Yan Zhang , University of Vermont, USA
pp. 1199-1204

Speedup Clustering with Hierarchical Ranking (Abstract)

Joerg Sander , University of Alberta, Canada
Jianjun Zhou , University of Alberta, Canada
pp. 1205-1210

Query-Sensitive Similarity Measure for Content-Based Image Retrieval (Abstract)

Zhi-Hua Zhou , Nanjing University, China
Hong-Bin Dai , Nanjing University, China
pp. 1211-1215
Author Index

Author Index (PDF)

pp. 1216