
ASCII Text  
Ruoming Jin, Ge Yang, Gagan Agrawal, "Shared Memory Parallelization of Data Mining Algorithms: Techniques, Programming Interface, and Performance," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 1, pp. 71-89, January 2005.  
BibTex  
@article{ 10.1109/TKDE.2005.18,
  author = {Ruoming Jin and Ge Yang and Gagan Agrawal},
  title = {Shared Memory Parallelization of Data Mining Algorithms: Techniques, Programming Interface, and Performance},
  journal = {IEEE Transactions on Knowledge and Data Engineering},
  volume = {17},
  number = {1},
  issn = {1041-4347},
  year = {2005},
  pages = {71-89},
  doi = {http://doi.ieeecomputersociety.org/10.1109/TKDE.2005.18},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}  
RefWorks Procite/RefMan/Endnote  
TY  - JOUR
JO  - IEEE Transactions on Knowledge and Data Engineering
TI  - Shared Memory Parallelization of Data Mining Algorithms: Techniques, Programming Interface, and Performance
IS  - 1
SN  - 1041-4347
SP  - 71
EP  - 89
EPD - 71-89
A1  - Ruoming Jin
A1  - Ge Yang
A1  - Gagan Agrawal
PY  - 2005
KW  - Shared memory parallelization
KW  - programming interfaces
KW  - association mining
KW  - clustering
KW  - decision tree construction
VL  - 17
JA  - IEEE Transactions on Knowledge and Data Engineering
ER  -  
[1] R. Agrawal, T. Imielinski, and A. Swami, “Database Mining: A Performance Perspective,” IEEE Trans. Knowledge and Data Eng., vol. 5, no. 6, pp. 914-925, Dec. 1993.
[2] R. Agrawal and J. Shafer, “Parallel Mining of Association Rules,” IEEE Trans. Knowledge and Data Eng., vol. 8, no. 6, pp. 962-969, June 1996.
[3] R. Agrawal and R. Srikant, “Fast Algorithms for Mining Association Rules,” Proc. 1994 Int'l Conf. Very Large Databases (VLDB '94), pp. 487-499, Sept. 1994.
[4] K. Alsabti, S. Ranka, and V. Singh, “Clouds: Classification for Large or Out-of-Core Datasets,” http://www.cise.ufl.edu/rankadm.html, 1998.
[5] P. Becuzzi, M. Coppola, and M. Vanneschi, “Mining of Association Rules in Very Large Databases: A Structured Parallel Approach,” Proc. Euro-Par '99, vol. 1685, pp. 1441-1450, Aug. 1999.
[6] W. Blume, R. Doallo, R. Eigenmann, J. Grout, J. Hoeflinger, T. Lawrence, J. Lee, D. Padua, Y. Paek, B. Pottenger, L. Rauchwerger, and P. Tu, “Parallel Programming with Polaris,” Computer, vol. 29, no. 12, pp. 78-82, Dec. 1996.
[7] R.D. Blumofe, C.F. Joerg, et al., “Cilk: An Efficient Multithreaded Runtime System,” Proc. Fifth ACM Conf. Principles and Practices of Parallel Programming (PPoPP), 1995.
[8] S. Brin, R. Motwani, J. Ullman, and S. Tsur, “Dynamic Itemset Counting and Implication Rules for Market Basket Data,” Proc. ACM SIGMOD Conf. Management of Data, May 1997.
[9] P. Cheeseman and J. Stutz, “Bayesian Classification (Autoclass): Theory and Practice,” Advances in Knowledge Discovery and Data Mining, pp. 61-83, 1996.
[10] I.S. Dhillon and D.S. Modha, “A Data-Clustering Algorithm on Distributed Memory Multiprocessors,” Proc. Workshop Large-Scale Parallel KDD Systems, in conjunction with the Fifth ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining (KDD '99), pp. 47-56, Aug. 1999.
[11] R. Ferreira, G. Agrawal, and J. Saltz, “Compiling Object-Oriented Data Intensive Computations,” Proc. 2000 Int'l Conf. Supercomputing, May 2000.
[12] G. Forman and B. Zhang, “Distributed Data Clustering Can be Efficient and Exact,” Proc. SIGKDD Explorations, vol. 2, no. 2, Dec. 2000.
[13] J. Gehrke, V. Ganti, R. Ramakrishnan, and W. Loh, “Boat - Optimistic Decision Tree Construction,” Proc. ACM SIGMOD Conf. Management of Data, June 1999.
[14] J. Gehrke, R. Ramakrishnan, and V. Ganti, “Rainforest - A Framework for Fast Decision Tree Construction of Large Datasets,” Proc. Conf. Very Large Databases (VLDB), 1998.
[15] S. Goil and A. Choudhary, “Efficient Parallel Classification Using Dimensional Aggregates,” Proc. Workshop Large-Scale Parallel KDD Systems, with ACM SIGKDD99, Aug. 1999.
[16] S. Goil and A. Choudhary, “PARSIMONY: An Infrastructure for Parallel Multidimensional Analysis and Data Mining,” J. Parallel and Distributed Computing, vol. 61, no. 3, pp. 285-321, Mar. 2001.
[17] E. Gutierrez, O. Plata, and E.L. Zapata, “A Compiler Method for the Parallel Execution of Irregular Reductions in Scalable Shared Memory Multiprocessors,” Proc. Int'l Conf. Supercomputing (ICS '00), pp. 78-87, May 2000.
[18] M. Hall, S. Amarasinghe, B. Murphy, S. Liao, and M. Lam, “Maximizing Multiprocessor Performance with the SUIF Compiler,” Computer, no. 12, Dec. 1996.
[19] E.H. Han, G. Karypis, and V. Kumar, “Scalable Parallel Datamining for Association Rules,” Proc. ACM SIGMOD 1997, May 1997.
[20] E.-H. Han, G. Karypis, and V. Kumar, “Scalable Parallel Datamining for Association Rules,” IEEE Trans. Knowledge and Data Eng., vol. 12, no. 3, May/June 2000.
[21] J. Han, J. Pei, and Y. Yin, “Mining Frequent Patterns without Candidate Generation,” Proc. ACM SIGMOD Conf. Management of Data, 2000.
[22] J. Han and M. Kamber, Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers, 2000.
[23] J.L. Hennessy and D.A. Patterson, Computer Architecture: A Quantitative Approach, 2nd ed. Morgan Kaufmann, Inc., 1996.
[24] IBM. Db2, “Universal Database Goes Parallel with Enterprise and Enterprise-Extended Editions,” http://www4.ibm.com/software/data/db2/udb 98eeebrochure, 1999.
[25] A.K. Jain and R.C. Dubes, Algorithms for Clustering Data. Prentice Hall, 1988.
[26] R. Jin and G. Agrawal, “An Efficient Implementation of Apriori Association Mining on Cluster of SMPs,” Proc. Workshop High Performance Data Mining (IPDPS 2001), Apr. 2001.
[27] R. Jin and G. Agrawal, “A Middleware for Developing Parallel Data Mining Implementations,” Proc. First SIAM Conf. Data Mining, Apr. 2001.
[28] R. Jin and G. Agrawal, “Performance Prediction for Random Write Reductions: A Case Study in Modeling Shared Memory Programs,” Proc. ACM SIGMETRICS, June 2002.
[29] M.V. Joshi, G. Karypis, and V. Kumar, “Scalparc: A New Scalable and Efficient Parallel Classification Algorithm for Mining Large Datasets,” Proc. Int'l Parallel Processing Symp., 1998.
[30] A. Kagi, “Mechanisms for Efficient Shared-Memory, Lock-Based Synchronization,” PhD thesis, Univ. of Wisconsin, Madison, 1999.
[31] A. Kagi, D. Burger, and J.R. Goodman, “Efficient Synchronization: Let Them Eat QOLB,” Proc. 24th Ann. Int'l Symp. Computer Architecture, pp. 170-180, June 1997.
[32] X. Li, R. Jin, and G. Agrawal, “A Compilation Framework for Distributed Memory Parallelization of Data Mining Algorithms,” submitted for publication, 2002.
[33] X. Li, R. Jin, and G. Agrawal, “Compiler and Runtime Support for Shared Memory Parallelization of Data Mining Algorithms,” Proc. Conf. Language and Compilers for Parallel Computing, Aug. 2002.
[34] Y. Lin and D. Padua, “On the Automatic Parallelization of Sparse and Irregular Fortran Programs,” Proc. Workshop Languages, Compilers, and Runtime Systems for Scalable Computers (LCR98), May 1998.
[35] H. Lu, A.L. Cox, S. Dwarkadas, R. Rajamony, and W. Zwaenepoel, “Compiler and Software Distributed Shared Memory Support for Irregular Applications,” Proc. Sixth ACM SIGPLAN Symp. Principles and Practice of Parallel Programming (PPOPP), pp. 48-56, June 1997.
[36] T. Mahapatra and S. Mishra, Oracle Parallel Processing. O'Reilly Publishers, 2000.
[37] M. Mehta, R. Agrawal, and J. Rissanen, “SLIQ: A Fast Scalable Classifier for Data Mining,” Proc. Fifth Int'l Conf. Extending Database Technology, 1996.
[38] A. Mueller, “Fast Sequential and Parallel Algorithms for Association Rule Mining: A Comparison,” Technical Report CS-TR-3515, Univ. of Maryland, College Park, Aug. 1995.
[39] S.K. Murthy, “Automatic Construction of Decision Trees from Data: A Multi-Disciplinary Survey,” Data Mining and Knowledge Discovery, vol. 2, no. 4, pp. 345-389, 1998.
[40] G.J. Narlikar, “A Parallel, Multithreaded Decision Tree Builder,” Technical Report CMU-CS-98-184, School of Computer Science, Carnegie Mellon Univ., 1998.
[41] C.R. Palmer and C. Faloutsos, “Density Biased Sampling: An Improved Method for Data Mining and Clustering,” Proc. 2000 ACM SIGMOD Int'l Conf. Management of Data, June 2000.
[42] J.S. Park, M. Chen, and P.S. Yu, “An Effective Hash Based Algorithm for Mining Association Rules,” Proc. ACM SIGMOD Int'l Conf. Management of Data, May 1995.
[43] S. Parthasarathy, M. Zaki, and W. Li, “Memory Placement Techniques for Parallel Association Mining,” Proc. Fourth Int'l Conf. Knowledge Discovery and Data Mining (KDD), Aug. 1998.
[44] S. Parthasarathy, M. Zaki, M. Ogihara, and W. Li, “Parallel Data Mining for Association Rules on Shared-Memory Systems,” Knowledge and Information Systems, to appear, 2000.
[45] F. Provost and V. Kolluri, “A Survey of Methods for Scaling up Inductive Algorithms,” Knowledge Discovery and Data Mining, vol. 3, 1999.
[46] J.R. Quinlan, C4.5: Programs for Machine Learning. San Mateo, Calif.: Morgan Kaufmann, 1993.
[47] S. Ruggieri, “Efficient C4.5,” Technical Report TR0001, Dept. of Information, Univ. of Pisa, Feb. 1999.
[48] J.H. Saltz, R. Mirchandaney, and K. Crowley, “Run-Time Parallelization and Scheduling of Loops,” IEEE Trans. Computers, vol. 40, no. 5, pp. 603-612, May 1991.
[49] A. Savasere, E. Omiecinski, and S. Navathe, “An Efficient Algorithm for Mining Association Rules in Large Databases,” Proc. 21st Conf. Very Large Databases (VLDB), 1995.
[50] J. Shafer, R. Agrawal, and M. Mehta, “SPRINT: A Scalable Parallel Classifier for Data Mining,” Proc. 22nd Int'l Conf. Very Large Databases (VLDB), pp. 544-555, Sept. 1996.
[51] A. Shatdal, “Architectural Considerations for Parallel Query Evaluation Algorithms,” Technical Report CS-TR-1996-1321, Univ. of Wisconsin, 1999.
[52] D.B. Skillicorn, “Strategies for Parallel Data Mining,” IEEE Concurrency, Oct./Dec. 1999.
[53] A. Srivastava, E. Han, V. Kumar, and V. Singh, “Parallel Formulations of Decision-Tree Classification Algorithms,” Proc. 1998 Int'l Conf. Parallel Processing, 1998.
[54] H. Yu and L. Rauchwerger, “Adaptive Reduction Parallelization Techniques,” Proc. 2000 Int'l Conf. Supercomputing, pp. 66-75, May 2000.
[55] M.J. Zaki, C.T. Ho, and R. Agrawal, “Parallel Classification for Data Mining on Shared-Memory Multiprocessors,” Proc. IEEE Int'l Conf. Data Eng., pp. 198-205, May 1999.
[56] M.J. Zaki, M. Ogihara, S. Parthasarathy, and W. Li, “Parallel Data Mining for Association Rules on Shared Memory Multiprocessors,” Proc. Conf. Supercomputing '96, Nov. 1996.
[57] M.J. Zaki, “Parallel and Distributed Association Mining: A Survey,” IEEE Concurrency, vol. 7, no. 4, pp. 14-25, 1999.