Gwo Giun (Chris) Lee, He-Yuan Lin, Chun-Fu Chen, and Tsung-Yuan Huang, "Quantifying Intrinsic Parallelism Using Linear Algebra for Algorithm/Architecture Coexploration," IEEE Transactions on Parallel and Distributed Systems, vol. 23, no. 5, pp. 944-957, May 2012.
DOI: 10.1109/TPDS.2011.230
ISSN: 1045-9219
Publisher: IEEE Computer Society, Los Alamitos, CA, USA
Keywords: intrinsic parallelism, linear algebra, algorithm/architecture coexploration
[1] G. Tan, N. Sun, and G.R. Gao, "Improving Performance of Dynamic Programming via Parallelism and Locality on Multicore Architectures," IEEE Trans. Parallel and Distributed Systems, vol. 20, no. 2, pp. 261-274, Feb. 2009.
[2] E. Seo, J. Jeong, S. Park, and J. Lee, "Energy Efficient Scheduling of Real-Time Tasks on Multicore Processors," IEEE Trans. Parallel and Distributed Systems, vol. 19, no. 11, pp. 1540-1552, Nov. 2008.
[3] G.G. Lee, Y.K. Chen, M. Mattavelli, and E.S. Jang, "Algorithm/Architecture Co-Exploration of Visual Computing: Overview and Future Perspectives," IEEE Trans. Circuits and Systems for Video Technology, vol. 19, no. 11, pp. 1576-1587, Nov. 2009.
[4] G.G. Lee, M.J. Wang, H.Y. Lin, D.W.C. Su, and B.Y. Lin, "Algorithm/Architecture Co-Design of 3D Spatio-Temporal Motion Estimation for Video Coding," IEEE Trans. Multimedia, vol. 9, no. 3, pp. 455-465, Apr. 2007.
[5] G.M. Amdahl, "Validity of the Single Processor Approach to Achieving Large Scale Computing Capabilities," Proc. Spring Joint Computer Conf. (AFIPS), pp. 483-485, 1967.
[6] A. Prihozhy, M. Mattavelli, and D. Mlynek, "Evaluation of the Parallelization Potential for Efficient Multimedia Implementations: Dynamic Evaluation of Algorithm Critical Path," IEEE Trans. Circuits and Systems for Video Technology, vol. 15, no. 5, pp. 593-608, May 2005.
[7] H.-Y. Lin and G.G. Lee, "Quantifying Intrinsic Parallelism via Eigen-Decomposition of Dataflow Graphs for Algorithm/Architecture Co-Exploration," Proc. IEEE Workshop Signal Processing Systems (SIPS), pp. 317-328, Oct. 2010.
[8] S.Y. Kung, VLSI Array Processors. Prentice Hall, 1988.
[9] K. Högstedt, L. Carter, and J. Ferrante, "On the Parallel Execution of Tiled Loops," IEEE Trans. Parallel and Distributed Systems, vol. 14, no. 3, pp. 307-321, Mar. 2003.
[10] M. Kandemir, A. Choudhary, N. Shenoy, P. Banerjee, and J. Ramanujam, "A Linear Algebra Framework for Automatic Determination of Optimal Data Layouts," IEEE Trans. Parallel and Distributed Systems, vol. 10, no. 2, pp. 115-135, Feb. 1999.
[11] J.W. Janneck, D. Miller, and D.B. Parlour, "Profiling Dataflow Programs," Proc. IEEE Int'l Conf. Multimedia and Expo (ICME '08), pp. 1065-1068, June 2008.
[12] B. Hendrickson and R. Leland, "An Improved Spectral Graph Partitioning Algorithm for Mapping Parallel Computations," Technical Report SAND92-1460, Sandia Nat'l Laboratories, 1992.
[13] A. Pothen, H.D. Simon, and K.P. Liou, "Partitioning Sparse Matrices with Eigenvectors of Graphs," SIAM J. Matrix Analysis and Applications, vol. 11, pp. 430-452, 1990.
[14] P. Lenders and J. Xue, "Eigenvectors-Based Parallelisation of Nested Loops with Affine Dependences," Proc. Int'l Conf. Algorithms and Architectures for Parallel Processing (ICAPP '97), pp. 357-366, Dec. 1997.
[15] T. Grötker, S. Liao, G. Martin, and S. Swan, System Design with SystemC. Springer, 2002.
[16] L.-F. Chao and E.H.-M. Sha, "Scheduling Data-Flow Graphs via Retiming and Unfolding," IEEE Trans. Parallel and Distributed Systems, vol. 8, no. 12, pp. 1259-1267, Dec. 1997.
[17] A.V. Oppenheim and R.W. Schafer, Discrete-Time Signal Processing. Prentice-Hall, 1989.
[18] S. Edwards, L. Lavagno, E.A. Lee, and A. Sangiovanni-Vincentelli, "Design of Embedded Systems: Formal Models, Validation, and Synthesis," Proc. IEEE, vol. 85, no. 3, pp. 366-390, Mar. 1997.
[19] F.R.K. Chung, Spectral Graph Theory (Regional Conferences Series in Mathematics), no. 92. AMS Bookstore, 1997.
[20] M. Fiedler, "Algebraic Connectivity of Graphs," Czechoslovak Math. J., vol. 23, no. 2, pp. 298-305, 1973.
[21] B. Mohar, "The Laplacian Spectrum of Graphs," Graph Theory, Combinatorics, and Applications, Y. Alavi, G. Chartrand, O. Oellermann, and A. Schwenk, eds., pp. 871-898. Wiley, 1991.
[22] E.K.P. Chong and S.H. Żak, An Introduction to Optimization, third ed. John Wiley & Sons, 2008.
[23] D.C. Lay, Linear Algebra and Its Applications, third ed. Addison Wesley, 2003.
[24] L. Zhuo and V.K. Prasanna, "Scalable and Modular Algorithms for Floating-Point Matrix Multiplication on Reconfigurable Computing Systems," IEEE Trans. Parallel and Distributed Systems, vol. 19, no. 5, pp. 666-681, May 2008.
[25] W. Sweldens, "The Lifting Scheme: A Custom-Design Construction of Biorthogonal Wavelets," Applied and Computational Harmonic Analysis, vol. 3, pp. 186-200, 1996.
[26] Intel Compiler, http://software.intel.com/en-us/intel-compilers/, 2011.
[27] Target, http://www.retarget.com/, 2011.
[28] G.G. Lee, H.-Y. Lin, D.W.-C. Su, and M.-J. Wang, "Multiresolution-Based Texture Adaptive Algorithm for High-Quality Deinterlacing," IEICE Trans. Information and Systems, vol. E90-D, no. 11, pp. 1821-1830, Nov. 2007.
[29] M. Monchiero, R. Canal, and A. González, "Power/Performance/Thermal Design-Space Exploration for Multicore Architectures," IEEE Trans. Parallel and Distributed Systems, vol. 19, no. 5, pp. 666-681, May 2008.
[30] R.W. Farebrother, Linear Least Squares Computations. Marcel Dekker, 1988.

