
ASCII Text
Nan Zhang, "A Novel Parallel Scan for Multicore Processors and Its Application in Sparse Matrix-Vector Multiplication," IEEE Transactions on Parallel and Distributed Systems, vol. 23, no. 3, pp. 397-404, March 2012.
BibTeX
@article{ 10.1109/TPDS.2011.174,
  author = {Nan Zhang},
  title = {A Novel Parallel Scan for Multicore Processors and Its Application in Sparse Matrix-Vector Multiplication},
  journal = {IEEE Transactions on Parallel and Distributed Systems},
  volume = {23},
  number = {3},
  issn = {1045-9219},
  year = {2012},
  pages = {397-404},
  doi = {http://doi.ieeecomputersociety.org/10.1109/TPDS.2011.174},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
RefWorks Procite/RefMan/Endnote
TY  - JOUR
JO  - IEEE Transactions on Parallel and Distributed Systems
TI  - A Novel Parallel Scan for Multicore Processors and Its Application in Sparse Matrix-Vector Multiplication
IS  - 3
SN  - 1045-9219
SP  - 397
EP  - 404
EPD - 397-404
A1  - Nan Zhang
PY  - 2012
KW  - Parallel algorithms
KW  - parallel scan
KW  - prefix sum
KW  - multicore computing
KW  - sparse matrix-vector multiplication
VL  - 23
JA  - IEEE Transactions on Parallel and Distributed Systems
ER  -
[1] N. Zhang, "A Novel Parallel Prefix Sum Algorithm and Its Implementation on Multi-Core Platforms," Proc. Second Int'l Conf. Computer Eng. and Technology, vol. 2, pp. 66-70, Apr. 2010.
[2] G.E. Blelloch, "Prefix Sums and Their Applications," Technical Report CMU-CS-90-190, School of Computer Science, Carnegie Mellon Univ., http://www.cs.cmu.edu/~guyb/papers/Ble93.pdf, Nov. 1990.
[3] W.D. Hillis and G.L. Steele Jr., "Data Parallel Algorithms," Comm. ACM, vol. 29, no. 12, pp. 1170-1183, Dec. 1986.
[4] K.E. Iverson, A Programming Language. John Wiley & Sons, Inc., Dec. 1962.
[5] G.E. Blelloch, "Scans as Primitive Parallel Operations," IEEE Trans. Computers, vol. 38, no. 11, pp. 1526-1538, Nov. 1989.
[6] G.E. Blelloch, "NESL: A Nested Data-Parallel Language (Version 2.6)," Technical Report CMU-CS-93-129, School of Computer Science, Carnegie Mellon Univ., 1993.
[7] W.D. Hillis, The Connection Machine. The MIT Press, 1985.
[8] G.E. Blelloch, J.C. Hardwick, J. Sipelstein, M. Zagha, and S. Chatterjee, "Implementation of a Portable Nested Data-Parallel Language," J. Parallel and Distributed Computing, vol. 21, no. 1, pp. 4-14, Apr. 1994.
[9] D. Horn, "Stream Reduction Operations for GPGPU Applications," GPU Gems 2, M. Pharr and R. Fernando, eds., ch. 36, pp. 573-589, Addison-Wesley Professional, 2005.
[10] S. Sengupta, A.E. Lefohn, and J.D. Owens, "A Work-Efficient Step-Efficient Prefix Sum Algorithm," Proc. Workshop Edge Computing Using New Commodity Architectures, pp. D26-D27, May 2006.
[11] M. Harris, S. Sengupta, and J.D. Owens, "Parallel Prefix Sum (Scan) with CUDA," GPU Gems 3, H. Nguyen, ed., ch. 39, Addison-Wesley, Aug. 2007.
[12] G.E. Blelloch, M.A. Heroux, and M. Zagha, "Segmented Operations for Sparse Matrix Computation on Vector Multiprocessors," Technical Report CMU-CS-93-173, School of Computer Science, Carnegie Mellon Univ. and Cray Research, Inc., Aug. 1993.
[13] S. Sengupta, M. Harris, Y. Zhang, and J.D. Owens, "Scan Primitives for GPU Computing," Proc. 22nd ACM SIGGRAPH/EUROGRAPHICS Symp. Graphics Hardware, pp. 97-106, 2007.
[14] Intel 64 and IA-32 Architectures Optimization Reference Manual, Intel Corporation, 248966-018, Mar. 2009.
[15] R.K. Malladi, "Using Intel VTune Performance Analyzer Events/Ratios and Optimizing Applications," http://software.intel.com, Jan. 2009.
[16] R.E. Bryant and D.R. O'Hallaron, Computer Systems: A Programmer's Perspective, ch. 9, p. 671. Prentice Hall, 2002.
[17] J.T. Schwartz, "UltraComputers," ACM Trans. Programming Languages and Systems, vol. 2, no. 4, pp. 484-521, Oct. 1980.
[18] S. Sengupta, M. Harris, and M. Garland, "Efficient Parallel Scan Algorithms for GPUs," Technical report, NVIDIA Corporation, Dec. 2008.
[19] S. Williams, L. Oliker, R. Vuduc, J. Shalf, K. Yelick, and J. Demmel, "Optimization of Sparse Matrix-Vector Multiplication on Emerging Multicore Platforms," Parallel Computing, vol. 35, no. 3, pp. 178-194, Mar. 2009.
[20] M. Krotkiewski and M. Dabrowski, "Parallel Symmetric Sparse Matrix-Vector Product on Scalar Multi-Core CPUs," Parallel Computing, vol. 36, no. 4, pp. 181-198, Apr. 2010.
[21] R. Vuduc, J.W. Demmel, and K.A. Yelick, "OSKI: A Library of Automatically Tuned Sparse Matrix Kernels," J. Physics: Conf. Series, vol. 16, pp. 521-530, 2005.
[22] T.A. Davis and Y. Hu, "The University of Florida Sparse Matrix Collection," submitted to ACM Trans. Math. Software, http://www.cise.ufl.edu/~davis/techreports/matrices.pdf, 2010.