|
| This Article | ||
| ||
| Share | ||
| Bibliographic References | ||
| Add to: | ||
| | ||
| Search | ||
| ||
| ASCII Text | x | ||
| Ruoming Jin, Ge Yang, Gagan Agrawal, "Shared Memory Parallelization of Data Mining Algorithms: Techniques, Programming Interface, and Performance," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 1, pp. 71-89, January, 2005. | |||
| BibTex | x | ||
| @article{ 10.1109/TKDE.2005.18, author = {Ruoming Jin and Ge Yang and Gagan Agrawal}, title = {Shared Memory Parallelization of Data Mining Algorithms: Techniques, Programming Interface, and Performance}, journal ={IEEE Transactions on Knowledge and Data Engineering}, volume = {17}, number = {1}, issn = {1041-4347}, year = {2005}, pages = {71-89}, doi = {http://doi.ieeecomputersociety.org/10.1109/TKDE.2005.18}, publisher = {IEEE Computer Society}, address = {Los Alamitos, CA, USA}, } | |||
| RefWorks Procite/RefMan/Endnote | x | ||
| TY - JOUR JO - IEEE Transactions on Knowledge and Data Engineering TI - Shared Memory Parallelization of Data Mining Algorithms: Techniques, Programming Interface, and Performance IS - 1 SN - 1041-4347 SP71 EP89 EPD - 71-89 A1 - Ruoming Jin, A1 - Ge Yang, A1 - Gagan Agrawal, PY - 2005 KW - Shared memory parallelization KW - programming interfaces KW - association mining KW - clustering KW - decision tree construction. VL - 17 JA - IEEE Transactions on Knowledge and Data Engineering ER - | |||
[1] R. Agrawal, T. Imielinski, and A. Swami, “Database Mining: A Performance Perspective,” IEEE Trans. Knowledge and Data Eng., vol. 5, no. 6, pp. 914-925, Dec. 1993.
[2] R. Agrawal and J. Shafer, “Parallel Mining of Association Rules,” IEEE Trans. Knowledge and Data Eng., vol. 8, no. 6, pp. 962-969, June 1996.
[3] R. Agrawal and R. Srikant, “Fast Algorithms for Mining Association Rules,” Proc. 1994 Int'l Conf. Very Large Databases (VLDB '94), pp. 487-499, Sept. 1994.
[4] K. Alsabti, S. Ranka, and V. Singh, “Clouds: Classification for Large or Out-of-Core Datasets,” http://www.cise.ufl.edu/rankadm.html, 1998.
[5] P. Becuzzi, M. Coppola, and M. Vanneschi, “Mining of Association Rules in Very Large Databases: A Structured Parallel Approach,” Proc. Europar-99, vol. 1685, pp. 1441-1450, Aug. 1999.
[6] W. Blume, R. Doallo, R. Eigenman, J. Grout, J. Hoelflinger, T. Lawrence, J. Lee, D. Padua, Y. Paek, B. Pottenger, L. Rauchwerger, and P. Tu, “Parallel Programming with Polaris,” Computer, vol. 29, no. 12, pp. 78-82, Dec. 1996.
[7] R.D. Blumofe, C.F. Joerg, et al., “Cilk: An Efficient Multithreaded Runtime System,” , Proc. Fifth ACM Conf. Principles and Practices of Parallel Programming (PPoPP), 1995.
[8] S. Brin, R. Motwani, J. Ullman, and S. Tsur, “Dynamic Itemset Counting and Implication Rules for Market Basket Data,” Proc. ACM SIGMOD Conf. Management of Data, May 1997.
[9] P. Cheeseman and J. Stutz, “Bayesian Classification (Autoclass): Theory and Practice,” Advanced in Knowledge Discovery and Data Mining, pp. 61-83, 1996.
[10] I.S. Dhillon and D.S. Modha, “A Data-Clustering Algorithm on Distributed Memory Multiprocessors,” Proc. Workshop Large-Scale Parallel KDD Systems, in conjunction with the Fifth ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining (KDD '99), pp. 47-56, Aug. 1999.
[11] R. Ferreira, G. Agrawal, and J. Saltz, “Compiling Object-Oriented Data Intensive Computations,” Proc. 2000 Int'l Conf. Supercomputing, May 2000.
[12] G. Forman and B. Zhang, “Distributed Data Clustering Can be Efficient and Exact,” Proc. SIGKDD Explorations, vol. 2, no. 2, Dec. 2000.
[13] J. Gehrke, V. Ganti, R. Ramakrishnan, and W. Loh, “Boat— Optimistic Decision Tree Construction,” Proc. ACM SIGMOD Conf. Management of Data, June 1999.
[14] J. Gehrke, R. Ramakrishnan, and V. Ganti, “Rainforest— A Framework for Fast Decision Tree Construction of Large Datasets,” Proc. Conf. Very Large Databases (VLDB), 1998.
[15] S. Goil and A. Choudhary, “Efficient Parallel Classification Using Dimensional Aggregates,” Proc. Workshop Large-Scala Parallel KDD Systems, with ACM SIGKDD-99, Aug. 1999.
[16] S. Goil and A. Choudhary, “PARSIMONY: An Infrastructure for Parallel Multidimensional Analysis and Data Mining,” J. Parallel and Distributed Computing, vol. 61, no. 3, pp. 285-321, Mar. 2001.
[17] E. Gutierrez, O. Plata, and E.L. Zapata, “A Compiler Method for the Parallel Execution of Irregular Reductions in Scalable Shared Memory Multiprocessors,” Proc. Int'l Conf. Supercomputing (ICS '00), pp. 78-87, May 2000.
[18] M. Hall, S. Amarsinghe, B. Murphy, S. Liao, and M. Lam, “Maximizing Multiprocessor Performance with the SUIF Compiler,” Computer, no. 12, Dec. 1996.
[19] E.-H. Han, G. Karypis, and V. Kumar, “Scalable Parallel Datamining for Association Rules,” Proc. ACM SIGMOD 1997, May 1997.
[20] E-H. Han, G. Karypis, and V. Kumar, “Scalable Parallel Datamining for Association Rules,” IEEE Trans. Data and Knowledge Eng., vol. 12, no. 3, May/June 2000.
[21] J. Han, J. Pei, and Y. Yin, “Mining Frequent Patterns without Candidate Generation,” Proc. ACM SIGMOD Conf. Management of Data, 2000.
[22] J. Han and M. Kamber, Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers, 2000.
[23] J.L. Hennessy and D.A. Patterson, Computer Architecture: A Quantitative Approach, 2nd ed. Morgan Kaufmann, Inc., 1996.
[24] IBM. Db2, “Universal Database Goes Parallel with Enterprise and Enterprise-Extended Editions,” http://www-4.ibm.com/software/data/db2/udb 98eeebrochure, 1999.
[25] A.K. Jain and R.C. Dubes, Algorithms for Clustering Data. Prentice Hall, 1988.
[26] R. Jin and G. Agrawal, “An Efficient Implementation of Apriori Association Mining on Cluster of SMPs,” Proc. Workshop High Performance Data Mining (IPDPS 2001), Apr. 2001.
[27] R. Jin and G. Agrawal, “A Middleware for Developing Parallel Data Mining Implementations,” Proc. First SIAM Conf. Data Mining, Apr. 2001.
[28] R. Jin and G. Agrawal, “Performance Prediction for Random Write Reductions: A Case Study in Modeling Shared Memory Programs,” Proc. ACM SIGMETRICS, June 2002.
[29] M.V. Joshi, G. Karypis, and V. Kumar, “Scalparc: A New Scalable and Efficient Parallel Classification Algorithm for Mining Large Datasets,” Proc. Int'l Parallel Processing Symp., 1998.
[30] A. Kagi, “Mechanisms for Efficient Shared-Memory, Lock-Based Sychronization,” PhD thesis, Univ. of Wisconsin, Madison, 1999.
[31] A. Kagi, D. Burger, and J.R. Goodman, “Efficient Synchronization: Let Them Eat QOLB,” Proc. 24th Ann. Int'l Symp. Computer Architecture, pp. 170-180, June 1997.
[32] X. Li, R. Jin, and G. Agrawal, “A Compilation Framework for Distributed Memory Parallelizattion of Data Mining Algorithms,” submitted for publication, 2002.
[33] X. Li, R. Jin, and G. Agrawal, “Compiler and Runtime Support for Shared Memory Parallelization of Data Mining Algorithms,” Proc. Conf. Language and Compilers for Parallel Computing, Aug. 2002.
[34] Y. Lin and D. Padua, “On the Automatic Parallelization of Sparse and Irregular Fortran Programs,” Proc. Workshop Languages, Compilers, and Runtime Systems for Scalable Computers (LCR-98), May 1998.
[35] H. Lu, A.L. Cox, S. Dwarkadas, R. Rajamony, and W. Zwaenepoel, “Compiler and Software Distributed Shared Memory Support for Irregular Applications,” Proc. Sixth ACM SIGPLAN Symp. Principles and Practice of Parallel Programming (PPOPP), pp. 48-56, June 1997.
[36] T. Mahapatra and S. Mishra, Oracle Parallel Processing. O'Reilly Publishers, 2000.
[37] M. Mehta, R. Agrawal, and J. Rissanen, “SLIQ: A Fast Scalable Classifier for Data Mining,” Proc. Fifth Int'l Conf. Extending Database Technology, 1996.
[38] A. Mueller, “Fast Sequential and Parallel Algorithms for Association Rule Mining: A Comparison,” Technical Report CS-TR-3515, Univ. of Maryland, College Park, Aug. 1995.
[39] S.K. Murthy, “Automatic Construction of Decision Trees from Data: A Multi-Disciplinary Survey,” Data Mining and Knowledge Discovery, vol. 2, no. 4, pp. 345-389, 1998.
[40] G.J. Narlikar, “A Parallel, Multithreaded Decision Tree Builder,” Technical Report CMU-CS-98-184, School of Computer Science, Carnegie Mellon Univ., 1998.
[41] C.R. Palmer and C. Faloutsos, “Density Biases Sampling: An Improved Method for Data Mining and Clustering,” Proc. 2000 ACM SIGMOD Int'l Conf. Management of Data, June 2000.
[42] J.S. Park, M. Chen, and P.S. Yu, “An Effecitive Hash Based Algorithm for Mining Association Rules,” Proc. ACM SIGMOD Int'l Conf. Management of Data, May 1995.
[43] S. Parthasarathy, M. Zaki, and W. Li, “Memory Placement Techniques for Parallel Association Mining,” Proc. Fourth Int'l Conf. Knowledge Discovery and Data Mining (KDD), Aug. 1998.
[44] S. Parthasarathy, M. Zaki, M. Ogihara, and W. Li, “Parallel Data Mining for Association Rules on Shared-Memory Systems,” Knowledge and Information Systems, to appear, 2000.
[45] F. Provost and V. Kolluri, “A Survey of Methods for Scaling up Inductive Algorithms,” Knowledge Discovery and Data Mining, vol. 3, 1999.
[46] J.R. Quinlan, C4.5: Programs for Machine Learning. San Mateo, Calif.: Morgan Kaufmann, 1993.
[47] S. Ruggieri, “Efficient C4.5,” Technical Report TR-00-01, Dept. of Information, Univ. of Pisa, Feb. 1999.
[48] J.H. Saltz, R. Mirchandaney, and K. Crowley, “Run-Time Parallelization and Scheduling of Loops,” IEEE Trans. Computers, vol. 40, no. 5, pp. 603-612, May 1991.
[49] A. Savasere, E. Omiecinski, and S. Navathe, “An Efficient Algorithm for Mining Association Rules in Large Databases,” Proc. 21st Conf. Very Large Databases (VLDB), 1995.
[50] J. Shafer, R. Agrawal, and M. Mehta, “SPRINT: A Scalable Parallel Classifier for Data Mining,” Proc. 22nd Int'l Conf. Very Large Databases (VLDB), pp. 544-555, Sept. 1996.
[51] A. Shatdal, “Architectural Considerations for Parallel Query Evaluation Algorithms,” Technical Report CS-TR-1996-1321, Univ. of Wisconsin, 1999.
[52] D.B. Skillicorn, “Strategies for Parallel Data Mining,” IEEE Concurrency, Oct./Dec. 1999.
[53] A. Srivastava, E. Han, V. Kumar, and V. Singh, “Parallel Formulations of Decision-Tree Classification Algorithms,” Proc. 1998 Int'l Conf. Parallel Processing, 1998.
[54] H. Yu and L. Rauchwerger, “Adaptive Reduction Parallelization Techniques,” Proc. 2000 Int'l Conf. Supercomputing, pp. 66-75, May 2000.
[55] M.J. Zaki, C.-T. Ho, and R. Agrawal, “Parallel Classification for Data Mining on Shared-Memory Multiprocessors,” Proc. IEEE Int'l Conf. Data Eng., pp. 198-205, May 1999.
[56] M.J. Zaki, M. Ogihara, S. Parthasarathy, and W. Li, “Parallel Data Mining for Association Rules on Shared Memory Multiprocessors,” Proc. Conf. Supercomputing '96, Nov. 1996.
[57] M.J. Zaki, “Parallel and Distributed Association Mining: A Survey,” IEEE Concurrency, vol. 7, no. 4, pp. 14-25, 1999.

