Shared memory machines have two major overheads: keeping caches coherent, and managing the multiple threads of computation needed to tolerate very long memory latencies. Thetis is a hybrid architecture providing both shared memory and efficient message passing. It will be built from commodity components and will consist of sites, each with a small number of computation processors and a separate, programmable auxiliary processor. The auxiliary processor performs overhead tasks such as maintaining cache directories and managing memory and thread queues. Because it is placed on its own bus, it does not block or delay memory accesses from computation processors that are not waiting for it. We describe how a threaded variant of C with a functional style enables the run-time system to determine coherence needs dynamically, dramatically reducing the overhead of maintaining coherent caches in a shared memory machine.
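As a rough illustration of the cache-directory bookkeeping that an auxiliary processor could take over, the following is a minimal sketch of a generic directory-based invalidation scheme in C. It is not the Thetis protocol, which the abstract does not specify. The names `dir_entry`, `dir_read`, `dir_write` and the limit `MAX_SITES` are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SITES 32u  /* hypothetical machine size, not from the paper */

/* One directory entry per cache line: records which sites hold a copy. */
typedef struct {
    uint32_t sharers;  /* bit i set => site i currently caches the line */
} dir_entry;

/* Count set bits, i.e. how many invalidation messages must be sent. */
static unsigned popcount32(uint32_t x) {
    unsigned n = 0;
    while (x) {
        x &= x - 1;  /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* A read adds the requesting site to the sharer set. */
static void dir_read(dir_entry *e, unsigned site) {
    assert(site < MAX_SITES);
    e->sharers |= (uint32_t)1 << site;
}

/* A write invalidates every other cached copy and leaves the writer as
 * the sole holder. Returns the number of invalidations required. */
static unsigned dir_write(dir_entry *e, unsigned site) {
    assert(site < MAX_SITES);
    uint32_t self = (uint32_t)1 << site;
    unsigned invalidations = popcount32(e->sharers & ~self);
    e->sharers = self;
    return invalidations;
}
```

In this sketch, if sites 0, 3 and 5 have read a line and site 3 then writes it, `dir_write` reports two invalidations and leaves site 3 as the only sharer. Every such update is work that a computation processor would otherwise do itself, which is the kind of overhead the abstract proposes to offload.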
Citation:
J. Morris, R.R. Gregg, D. Herbert, J. McCoull, "Reducing Overheads in Distributed Shared Memory Systems," 30th Hawaii International Conference on System Sciences (HICSS), Volume 1: Software Technology and Architecture, p. 244, 1997.