2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) (2016)
Taipei, Taiwan
Oct. 15, 2016 to Oct. 19, 2016
ISBN: 978-1-5090-3509-0
pp: 1-13
Adrian M. Caulfield, Microsoft Corporation
Eric S. Chung, Microsoft Corporation
Andrew Putnam, Microsoft Corporation
Hari Angepat, Microsoft Corporation
Jeremy Fowers, Microsoft Corporation
Michael Haselman, Microsoft Corporation
Stephen Heil, Microsoft Corporation
Matt Humphrey, Microsoft Corporation
Puneet Kaur, Microsoft Corporation
Joo-Young Kim, Microsoft Corporation
Daniel Lo, Microsoft Corporation
Todd Massengill, Microsoft Corporation
Kalin Ovtcharov, Microsoft Corporation
Michael Papamichael, Microsoft Corporation
Lisa Woods, Microsoft Corporation
Sitaram Lanka, Microsoft Corporation
Derek Chiou, Microsoft Corporation
Doug Burger, Microsoft Corporation
Hyperscale datacenter providers have struggled to balance the growing need for specialized hardware (efficiency) with the economic benefits of homogeneity (manageability). In this paper we propose a new cloud architecture that uses reconfigurable logic to accelerate both network plane functions and applications. This Configurable Cloud architecture places a layer of reconfigurable logic (FPGAs) between the network switches and the servers, enabling network flows to be programmably transformed at line rate, enabling acceleration of local applications running on the server, and enabling the FPGAs to communicate directly, at datacenter scale, to harvest remote FPGAs unused by their local servers. We deployed this design over a production server bed, and show how it can be used for both service acceleration (Web search ranking) and network acceleration (encryption of data in transit at high speeds). This architecture is much more scalable than prior work, which used secondary rack-scale networks for inter-FPGA communication. By coupling to the network plane, direct FPGA-to-FPGA messages can be achieved at comparable latency to previous work, without the secondary network. Additionally, the scale of direct inter-FPGA messaging is much larger. The average round-trip latencies observed in our measurements among 24, 1000, and 250,000 machines are under 3, 9, and 20 microseconds, respectively. The Configurable Cloud architecture has been deployed at hyperscale in Microsoft's production datacenters worldwide.

A. M. Caulfield et al., "A cloud-scale acceleration architecture," 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Taipei, Taiwan, 2016, pp. 1-13.