
Frank J. Seinstra, Dennis Koelma, Andrew D. Bagdanov, "Finite State Machine-Based Optimization of Data Parallel Regular Domain Problems Applied in Low-Level Image Processing," IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 10, pp. 865-877, October 2004. DOI: 10.1109/TPDS.2004.55. ISSN: 1045-9219. Publisher: IEEE Computer Society, Los Alamitos, CA, USA.
Keywords: Parallel processing, data communications aspects, optimization, image processing software.
Abstract: A popular approach to providing nonexperts in parallel computing with an easy-to-use programming model is to design a software library consisting of a set of pre-parallelized routines, and hide the intricacies of parallelization behind the library's API. However, for regular domain problems (such as simple matrix manipulations or low-level image processing applications, in which all elements in a regular subset of a dense data field are accessed in turn), the speedup obtained with many such library-based parallelization tools is often suboptimal. This is because inter-operation optimization (or: time-optimization of communication steps across library calls) is generally not incorporated in the library implementations. This paper presents a simple, efficient, finite state machine-based approach for communication minimization of library-based data parallel regular domain problems. In the approach, referred to as
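
The general idea of minimizing communication across library calls can be illustrated with a small sketch. It is not the paper's state machine: the two distribution states, the operation names (`pixel_op`, `write_file`), and the transition table are all hypothetical, chosen only to show how tracking the state of a data structure lets a runtime skip redundant scatter and gather steps between consecutive calls.

```python
# Hypothetical sketch: states, operations, and transitions are illustrative
# assumptions, not the paper's actual finite state machine or library API.

# Assumed distribution states of one data structure (e.g., an image).
ROOT = "root"            # full data held by a single root node
SCATTERED = "scattered"  # data partitioned over all nodes

# Assumed communication steps needed to move between states.
TRANSITIONS = {
    (ROOT, SCATTERED): ["scatter"],
    (SCATTERED, ROOT): ["gather"],
}

# Each library call: (state required on input, state left on output).
OPS = {
    "pixel_op": (SCATTERED, SCATTERED),
    "write_file": (ROOT, ROOT),
}


def naive_schedule(program, home=ROOT):
    """Each call communicates on its own and returns the data to `home`."""
    steps = []
    for name in program:
        required, produced = OPS[name]
        steps += TRANSITIONS.get((home, required), [])
        steps += TRANSITIONS.get((produced, home), [])
    return steps


def lazy_schedule(program, start=ROOT):
    """Track the current state; communicate only when a call requires it."""
    steps, state = [], start
    for name in program:
        required, produced = OPS[name]
        if state != required:
            steps += TRANSITIONS[(state, required)]
        state = produced
    return steps


program = ["pixel_op", "pixel_op", "pixel_op", "write_file"]
print(len(naive_schedule(program)), naive_schedule(program))
print(len(lazy_schedule(program)), lazy_schedule(program))
```

For this four-call program the naive schedule performs six communication steps (a scatter and a gather around each `pixel_op`), while the state-tracking schedule performs two: one scatter before the first `pixel_op` and one gather before `write_file`.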