Digging Deep into the Data Mine with DataMiningGrid
November/December 2008 (vol. 12 no. 6)
pp. 69-76
Vlado Stankovski, University of Ljubljana
Martin Swain, University of Ulster
Valentin Kravtsov, Technion-Israel Institute of Technology
Thomas Niessen, Scopevisio AG
Dennis Wegener, Fraunhofer Institute for Intelligent Analysis and Information Systems
Matthias Röhm, DaimlerChrysler
Jernej Trnkoczy, University of Ljubljana
Michael May, Fraunhofer Institute for Intelligent Analysis and Information Systems
Jürgen Franke, DaimlerChrysler
Assaf Schuster, Technion-Israel Institute of Technology
Werner Dubitzky, University of Ulster
The growing computerization in modern knowledge and technology sectors is generating huge volumes of electronically stored data. Data mining technology is often employed to make sense of these data. However, as modern data mining applications increase in complexity, so do their demands for resources. Grid computing is one of several emerging networked computing paradigms promising to meet the requirements of heterogeneous, large-scale and distributed data mining applications. Despite this promise, there are still too many issues to be resolved before grid technology is commonly applied to large-scale data mining tasks. To address some of these issues, we developed the DataMiningGrid system, which principally differs from similar systems by its ability to integrate a diverse set of programs and application scenarios within a single framework. The system's key features include high performance and scalability, sophisticated support for relevant standards, different user types, and flexible extensibility. The software is available as open source.

Index Terms:
distributed architectures, distributed applications, data mining, middleware/business logic
Citation:
Vlado Stankovski, Martin Swain, Valentin Kravtsov, Thomas Niessen, Dennis Wegener, Matthias Röhm, Jernej Trnkoczy, Michael May, Jürgen Franke, Assaf Schuster, Werner Dubitzky, "Digging Deep into the Data Mine with DataMiningGrid," IEEE Internet Computing, vol. 12, no. 6, pp. 69-76, Nov.-Dec. 2008, doi:10.1109/MIC.2008.122