In addition to academic designs, a growing number of commercial coarse-grained reconfigurable arrays have recently been developed. Computationally intensive applications from the areas of video and wireless communication seek to exploit the computational power of such massively parallel SoCs. Conventionally, DSPs have been used in the digital signal processing domain, so existing compilation techniques are closely related to approaches from the DSP world. These approaches employ several loop transformations, such as pipelining or temporal partitioning, but they cannot exploit the full parallelism of a given algorithm or the computational potential of a typical two-dimensional array.
In this paper, we (i) give an overview of the constraints that have to be considered when mapping applications to coarse-grained reconfigurable arrays, (ii) present our design methodology for mapping regular algorithms onto massively parallel arrays, which is characterized by loop parallelization in the polytope model, and (iii) adapt this methodology, in a first case study, to target reconfigurable arrays. The case study shows that the presented regular mapping methodology may lead to highly efficient implementations while taking the constraints of the architecture into account.