2016 IEEE 32nd International Conference on Data Engineering (ICDE) (2016)
May 16, 2016 to May 20, 2016
Michael Cafarella, University of Michigan, USA
Ihab F. Ilyas, University of Waterloo, USA
Marcel Kornacker, Cloudera, USA
Tim Kraska, Brown University, USA
Christopher Re, Stanford University, USA
As enterprises increasingly seek to ingest as much data as they can into what are commonly called "Data Lakes", the new environment presents serious challenges to traditional ETL models and to building analytic layers on top of a well-understood global schema. With the recent development of multiple technologies that support this "load-first" paradigm, even traditional enterprises now have fairly large HDFS-based data lakes. They have had them long enough that their first-generation IT projects have delivered on some, but not all, of the promise of integrating their enterprise's data assets. In short, we have moved from no data to dark data. Dark data is data that enterprises might have in their possession without being able to access it, or with only limited awareness of what it represents. In particular, business-critical information might still remain out of reach. This panel is about dark data and whether we have been focusing on the right data management challenges in dealing with it.
Big data, Data mining, Lakes, Cleaning, Databases, Computer science, Information retrieval
M. Cafarella, I. F. Ilyas, M. Kornacker, T. Kraska and C. Re, "Dark Data: Are we solving the right problems?," 2016 IEEE 32nd International Conference on Data Engineering (ICDE), Helsinki, Finland, 2016, pp. 1444-1445.