Distributed shared memory is an architectural technique for providing a global view of memory in a distributed-store parallel machine by introducing mechanisms which copy remote areas of memory when required. A major problem of such systems is the performance penalty incurred while waiting for areas of memory to be copied. This can be ameliorated to a certain extent by using user annotations, compile-time analysis or run-time prediction to aid pre-fetching of data. This paper proposes a decoupled run-time technique for pre-fetching in a distributed shared memory environment, applicable where static analysis is difficult and the access patterns are irregular enough that run-time prediction may fail. The proposal takes the form of a dual processor structure in which one processor performs a partial evaluation of the program, and thereby anticipates data fetches before they are required by a second processor which performs the full evaluation.
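The dual processor idea can be illustrated with a toy timing model. The following is a minimal sketch and not the mechanism described in the paper: the function name `simulate`, the `lookahead` parameter, the fixed `REMOTE_LATENCY` and the assumption of unlimited fetch bandwidth are all hypothetical choices made for illustration. The list `accesses` stands for the stream of remote block numbers that a partial evaluation (address computation only) would produce for an irregular access such as `a[idx[i]]`.

```python
REMOTE_LATENCY = 10  # assumed cycles to copy one remote block (hypothetical value)


def simulate(accesses, lookahead, latency=REMOTE_LATENCY):
    """Return the stall cycles seen by the full-evaluation processor.

    accesses  -- block numbers in program order, as produced by the
                 address-only (partial evaluation) processor
    lookahead -- how many accesses that processor runs ahead;
                 0 means no pre-fetching (demand fetches only)
    """
    ready_at = {}  # block -> cycle at which its local copy becomes usable
    clock = 0      # current cycle of the full-evaluation processor
    stall = 0      # total cycles spent waiting for remote copies
    issued = 0     # number of accesses already seen by the run-ahead processor

    for i, block in enumerate(accesses):
        # The run-ahead processor issues fetches for upcoming accesses.
        limit = min(len(accesses), i + lookahead)
        while issued < limit:
            ready_at.setdefault(accesses[issued], clock + latency)
            issued += 1

        # If nobody fetched the block yet, the consumer issues a demand fetch.
        if block not in ready_at:
            ready_at[block] = clock + latency

        # Wait until the copy has arrived, then spend one cycle using it.
        if ready_at[block] > clock:
            stall += ready_at[block] - clock
            clock = ready_at[block]
        clock += 1

    return stall


if __name__ == "__main__":
    irregular = [7, 3, 9, 1]  # hypothetical irregular block sequence
    print("no pre-fetch :", simulate(irregular, lookahead=0))
    print("decoupled    :", simulate(irregular, lookahead=4))
```

Under these assumptions, running ahead lets the fetch latencies overlap instead of being paid one after another by the processor doing the full evaluation. A real system would also have to bound the run-ahead distance and handle data-dependent addresses that the partial evaluation cannot resolve early; the sketch ignores both.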
Index Terms:
distributed memory systems; shared memory systems; memory architecture; parallel architectures; partial evaluation (compilers); distributed shared memory environment; decoupled pre-fetching; global view; parallel machine; remote memory copies; user annotations; compile-time analysis; run-time prediction; irregular access patterns; dual processor structure; partial program evaluation; data fetches
Citation:
I. Watson, A. Rawsthorne, "Decoupled pre-fetching for distributed shared memory," 28th Hawaii International Conference on System Sciences (HICSS'95), p. 252, 1995