2016 IEEE International Conference on Cluster Computing (CLUSTER) (2016)
Sept. 12, 2016 to Sept. 16, 2016
Compute clusters, consisting of many uniformly built nodes, are used to run a large spectrum of different workloads, such as tightly coupled (MPI) jobs, MapReduce, or graph-processing data-analytics applications, each with its own resource requirements. Many studies consistently highlight two types of under-utilized cluster resources: memory (up to 50%) and network. In this work, we take a step towards (software) resource disaggregation, and therefore increased resource utilization, by designing a memory scavenging technique that makes unused memory available to applications on other cluster nodes. We implement this technique in MemFSS, an in-memory distributed file system. The scavenging MemFSS extends its storage space by taking advantage of the unused memory and bandwidth of cluster nodes already running other tenants' applications. Our experiments show that our memory scavenging approach incurs negligible overhead (below 10%) for most tenant applications, while the compute resource consumption of MemFSS applications is largely reduced (by 17%-74%).
Bandwidth, Memory management, Resource management, Parallel processing, Protocols, Servers, Scalability
A. Uta, A. Oprescu and T. Kielmann, "Towards Resource Disaggregation — Memory Scavenging for Scientific Workloads," 2016 IEEE International Conference on Cluster Computing (CLUSTER), Taipei, Taiwan, 2016, pp. 100-109.