1541-4922/04/$25.00 © 2004 IEEE
Published by the IEEE Computer Society
Guest Editors' Introduction
Laurence T. Yang, St. Francis Xavier University
Marcin Paprzycki, Oklahoma State University
Xiaohui Shen, Motorola
Xingfu Wu, Texas A&M University
Data-intensive computing has become a key characteristic of modern large-scale scientific and engineering applications. For example, applications in the US Department of Energy's Accelerated Strategic Computing Initiative typically generate data in the range of hundreds of gigabytes to hundreds of terabytes. By 2005, detectors at the Large Hadron Collider at CERN, the European Laboratory for Particle Physics, will be producing several petabytes of data each year. The problem ASCI and similar applications face is that, with existing technologies, data sets the applications generate are much too large for effective storage, manipulation, archiving, navigation, visualization, or understanding.
Worse, our ability to compute, and thus to generate, large volumes of data is growing faster than our ability to process and visualize the results affordably. So, without substantial, more-than-evolutionary changes in technology, we won't be able to explore massive data sets effectively. Terascale or even petascale data-intensive computing requires new models and approaches to solve such storage problems. In addition, problem solving in scientific and engineering applications requires collaboration and the use of distributed resources, which further exacerbates the problem.
Over the coming months, this special issue will feature four peer-reviewed articles focusing on data storage and data mining for data-intensive applications:
"Storage System Support for Multimedia Applications," by Pål Halvorsen, Carsten Griwodz, Ketil Lund, Vera Goebel, and Thomas Plagemann
"ODAM: An Optimized Distributed Association Mining Algorithm," by Mafruz Zaman Ashrafi, David Taniar, and Kate Smith
"Estimating Computation Times to Support of Data-Intensive Application Scheduling," by Shonali Krishnaswamy, Seng Wai Loke, and Arkady Zaslavsky
"Intelligent and Smart Management of Large-Scale Datasets Hold on Tertiary Storage Systems," by Bernd Reiner and Karl Hahn
We believe these articles and the topics they cover not only provide novel ideas, new results, and state-of-the-art techniques in this field, but will also stimulate future research in data mining, data management, I/O techniques, and hierarchical storage systems for large-scale, data-intensive applications.
This special issue is the result of the hard work of many people, including the authors and reviewers. We would like to express our sincere appreciation to the authors for their valuable contributions and to the external reviewers for their cooperation in completing the special issue under a tight schedule. Last, but not least, we thank Jean Bacon, editor in chief, for supporting and encouraging this special issue in IEEE DS Online.
Laurence T. Yang is a professor in the Department of Computer Science at St. Francis Xavier University, Canada. His research interests include design and testing of embedded systems, high-performance computing and networking, wireless and mobile computing, and pervasive computing and communications. Contact him at lyang@stfx.ca.
Marcin Paprzycki is an assistant professor in the Computer Science Department at Oklahoma State University. His research interests include parallel computing and agent-based distributed systems. He received his PhD from Southern Methodist University. Contact him at the Computer Science Dept., Oklahoma State Univ., Tulsa, OK 74106; marcin@cs.okstate.edu.
Xiaohui Shen is a senior engineer in the Core Technology Department, Personal Communication Sector, at Motorola. He received his PhD from Northwestern University in 2001. His research interests include parallel and distributed computing, data management, I/O and storage systems, and mobile computing. Contact him at axs095@email.mot.com.
Xingfu Wu is a TEES research scientist in the computer science department at Texas A&M University. His research interests include database-driven performance analysis systems, parallel and grid computing, performance evaluation and modeling, and parallel programming environments and tools. He received his PhD in computer science from Beijing University of Aeronautics and Astronautics in 1997. Contact him at wuxf@cs.tamu.edu.