CMOS manufacturing technology has reached a point where the physical limits of semiconductor-based microelectronics cause serious heat dissipation and data synchronization problems. As a result, microprocessor clock speeds and straight-line instruction throughput have not risen significantly over the past few years. This has led to a revolutionary change in chip design, characterized by multi-core architectures. In the near future, commercial off-the-shelf (COTS) chips with tens or hundreds of processor cores will become the standard. As a consequence, parallel programming will no longer be restricted to the domain of high-performance computing but will become a mainstream technology. Despite significant efforts in industry and academia, there are at present no generally accepted strategies for the programming and execution models of the emerging multi-level hierarchical systems or for their programming environments.