<p>Processors that target throughput computing often have many cores, which stresses the cache hierarchy. Many-core chips need logically centralized, shared data storage to provide high cache throughput for shared lines that are heavily read and written. Techniques that reduce on-die and off-die traffic yield a dramatic energy benefit for many-core chips.</p>
