ICDM 2019 Ignites Data-Mining Innovations That are Driving Multi-Billion Dollar AI Market

By Lori Cameron

September 24, 2019

One of the most ambitious forecasts to date on the future of artificial intelligence has been released: experts expect the global market to top $77.6 billion by 2022.

Data mining is the driving force behind this tech.

Industry leaders can keep up with critical developments in data mining at ICDM 2019, the world’s premier research conference on the subject.

The IEEE International Conference on Data Mining (ICDM) provides an international forum for the presentation of original research results, as well as innovations and practical development. This year's conference will be held in Beijing, China, 8-11 November 2019.

How Data Mining Compensates for a Shortage of Analysts

The benefits of data mining are numerous for a wide range of business tasks—customer segmentation, manufacturing, fraud detection, patient monitoring, banking, surveillance, criminal investigations, and bioinformatics, for example.

Many businesses suffer from a widening gap between exploding data and an ever-shrinking supply of analysts. Data mining provides the most efficient technical solution to date.

ICDM 2019 will draw researchers, application developers, and practitioners from a wide range of data mining-related areas such as statistics, machine learning, pattern recognition, databases, data warehousing, data visualization, knowledge-based systems, and high-performance computing. By promoting novel, high-quality research findings and innovative solutions to challenging data mining problems, the conference seeks to advance the state of the art in data mining.

ICDM 2019 Topics of Interest

ICDM 2019 will focus on emerging topics of high importance such as data quality, time-evolving networks, big data mining and analytics, cyber-physical systems, and heterogeneous data integration and mining:

Foundations, algorithms, models, and theory of data mining, including big data mining.

Machine learning and statistical methods for data mining.

Mining from heterogeneous data sources, including text, semi-structured, spatio-temporal, streaming, graph, web, and multimedia data.

Data mining systems and platforms, and their efficiency, scalability, security, and privacy.

Data mining for modelling, visualization, personalization, and recommendation.

Data mining for cyber-physical systems and complex, time-evolving networks.

Applications of data mining in social sciences, physical sciences, engineering, life sciences, web, marketing, finance, precision medicine, health informatics, and other domains.

Meet Our ICDM 2019 Keynote Speakers

In “Applying theory of data to practice,” Ronald Fagin of IBM will talk about applying theory to practice, with a focus on two IBM case studies. In the first case study, there is a set of “voters” and a set of “candidates”, where each voter assigns a numerical score to each candidate. There is a scoring function (such as the mean or median), and a consensus ranking is obtained by applying the scoring function to each candidate’s scores. The problem is to find the top k candidates while minimizing the number of database accesses. Fagin will present an algorithm that is optimal in an extremely strong sense: not just in the worst case or the average case, but in every case. Even though the algorithm is only ten lines long, the paper containing the algorithm won the 2014 Gödel Prize, the top prize for a paper in theoretical computer science. The talk is aimed at both theoreticians and practitioners, to show them the mutual benefits of working together.
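The article does not reproduce the algorithm itself. As a rough illustration of the problem described above, the following Python sketch follows the threshold-style approach from the published literature on top-k aggregation: read each voter's list in descending score order, look up each newly seen candidate's remaining scores, and stop once k candidates score at least as high as the aggregate of the last scores read. The function name, the in-memory dictionaries standing in for database lists, the access counter, and the sample data are all assumptions made for illustration. The sketch also assumes that every voter scores every candidate and that the scoring function is monotone (as the mean and median are).

```python
import heapq
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple


def threshold_top_k(
    scores: Sequence[Dict[str, float]],
    k: int,
    aggregate: Callable[[List[float]], float] = mean,
) -> Tuple[List[Tuple[str, float]], int]:
    """Return the top-k candidates and the number of list accesses used.

    `scores` holds one dict per voter, mapping candidate -> score.
    Hypothetical helper: the dicts stand in for database lists.
    """
    # One list per voter, sorted by descending score ("sorted access").
    sorted_lists = [
        sorted(s.items(), key=lambda kv: kv[1], reverse=True) for s in scores
    ]
    seen: Dict[str, float] = {}  # candidate -> aggregated score
    accesses = 0

    for depth in range(len(sorted_lists[0])):
        last_read = []
        for lst in sorted_lists:
            candidate, score = lst[depth]
            accesses += 1  # one sorted access
            last_read.append(score)
            if candidate not in seen:
                # "Random access": fetch this candidate's other scores.
                seen[candidate] = aggregate([s[candidate] for s in scores])
                accesses += len(scores) - 1

        # No unseen candidate can score above this threshold.
        threshold = aggregate(last_read)
        top = heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
        if len(top) >= k and top[-1][1] >= threshold:
            return top, accesses

    return heapq.nlargest(k, seen.items(), key=lambda kv: kv[1]), accesses


# Made-up sample data: three voters scoring four candidates.
voters = [
    {"a": 0.9, "b": 0.8, "c": 0.1, "d": 0.2},
    {"a": 0.8, "b": 0.9, "c": 0.3, "d": 0.1},
    {"a": 0.7, "b": 0.6, "c": 0.2, "d": 0.4},
]

if __name__ == "__main__":
    top, accesses = threshold_top_k(voters, k=2)
    print([name for name, _ in top], accesses)
```

On this sample data the loop stops before reading every score, which is the point of the exercise: the top candidates are identified without scanning the full lists.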

Ronald Fagin is an IBM Fellow at IBM Research ‐ Almaden. IBM Fellow is IBM’s highest technical honor. There are currently fewer than 100 active IBM Fellows (out of around 400,000 IBM employees worldwide), and there have been only around 250 IBM Fellows in the program's more than 50-year history.

In his talk “Actual Causality,” Joseph Halpern of Cornell will argue that defining actual causation goes beyond mere philosophical speculation. In many legal arguments, for example, actual causation is precisely what needs to be established in order to determine responsibility. (What exactly was the actual cause of the car accident or the medical problem?) The philosophy literature has been struggling with the problem of defining causality since the days of Hume in the 1700s. Many of the definitions have been couched in terms of counterfactuals. (C is a cause of E if, had C not happened, E would not have happened.) In 2001, Judea Pearl and Halpern introduced a new definition of actual cause, using Pearl’s notion of structural equations to model counterfactuals. The definition has been revised twice since then, extended to deal with notions like “responsibility” and “blame,” and applied in databases and program verification. Halpern will survey the last 15 years of work in this area, including joint work with Judea Pearl, Hana Chockler, and Chris Hitchcock.
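The counterfactual test quoted in parentheses above can be sketched in a few lines of Python. This is only the naive "but-for" check applied to a single toy equation, not the Halpern-Pearl definition the talk covers. The forest-fire model, the variable names, and the helper function are illustrative assumptions.

```python
from typing import Callable, Dict

# A toy "structural equation": maps a setting of the input
# variables to whether the effect occurs. Purely illustrative.
Model = Callable[[Dict[str, bool]], bool]


def forest_fire(ctx: Dict[str, bool]) -> bool:
    """The forest burns if lightning strikes OR a match is dropped."""
    return ctx["lightning"] or ctx["match_dropped"]


def is_but_for_cause(model: Model, context: Dict[str, bool], variable: str) -> bool:
    """Naive counterfactual test: C is a cause of E if C and E both
    happened and, had C not happened, E would not have happened."""
    if not (context[variable] and model(context)):
        return False  # C or E did not actually happen
    counterfactual = dict(context)
    counterfactual[variable] = False  # intervene: "had C not happened"
    return not model(counterfactual)


if __name__ == "__main__":
    only_lightning = {"lightning": True, "match_dropped": False}
    both = {"lightning": True, "match_dropped": True}
    print(is_but_for_cause(forest_fire, only_lightning, "lightning"))
    print(is_but_for_cause(forest_fire, both, "lightning"))
```

When both lightning and a dropped match occur, the naive test names neither as a cause, since the fire would have happened either way. Cases like this are one reason the simple counterfactual definition has needed refinement.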

Joseph Halpern joined the IBM Almaden Research Center in 1982, where he remained until 1996, also serving as a consulting professor at Stanford. In 1996, he joined the Computer Science Department at Cornell University, where he is currently the Joseph C. Ford Professor and served as department chair from 2010 to 2014.

In his presentation “Embedding-Based Text Mining: A Frontier in Data Mining,” Jiawei Han of the University of Illinois at Urbana-Champaign will talk about how real-world big data are largely unstructured, interconnected text data. It is highly desirable to mine such massive unstructured data and transform them into structures and knowledge. Since labor-intensive annotation and curation may not be scalable, a weak, distant, or self-supervised approach is worth exploring. Embedding has become a powerful tool for enabling such an approach. In this talk, Han will introduce a set of text-embedding approaches developed in recent years and demonstrate their power in text analysis and data mining tasks. Embedding-based text mining is a new frontier in data mining that may fundamentally transform the field.

Jiawei Han is Michael Aiken Chair Professor in the Department of Computer Science, University of Illinois at Urbana-Champaign. His research interests include data mining, information network analysis, database systems, and data warehousing, with over 900 journal and conference publications.

The Venue - China National Convention Center (CNCC), Beijing

The China National Convention Center in Beijing was originally built to accommodate a portion of the 2008 Summer Olympic Games. The following year, it opened for its originally intended function: providing facilities for international conventions and exhibitions.

The CNCC, a stately edifice covering 530,000 square meters, towers over Beijing’s northern horizon, along with landmarks like the Bird’s Nest, or Beijing National Stadium. Having served as a competition venue for fencing and for the shooting and fencing events in the modern pentathlon at the 2008 Beijing Olympics, the CNCC underwent a year-long retrofit that transformed the sports facility into a conference venue.

Since completion of renovations in November 2009, CNCC has hosted a number of high-profile international events, underlining its prestige domestically and globally in the convention and exhibition industry.

If you are inclined to see the sights, the Great Wall of China and the Forbidden City are both only a short bus ride away.

About Lori Cameron

Lori Cameron is Senior Writer for IEEE Computer Society publications and digital media platforms with over 20 years of technical writing experience. She is a part-time English professor and winner of two 2018 LA Press Club Awards. Contact her at l.cameron@computer.org. Follow her on LinkedIn.