11th International Symposium on High-Performance Computer Architecture (HPCA'05)
An Efficient Programmable 10 Gigabit Ethernet Network Interface Card
San Francisco, California
February 12-February 16
ISBN: 0-7695-2275-0
This paper explores the hardware and software mechanisms necessary for an efficient programmable 10 Gigabit Ethernet network interface card. Network interface processing requires support for the following characteristics: a large volume of frame data, frequently accessed frame metadata, and high frame rate processing. This paper proposes three mechanisms to improve programmable network interface efficiency. First, a partitioned memory organization enables low-latency access to control data and high-bandwidth access to frame contents from a high-capacity memory. Second, a novel distributed task-queue mechanism enables parallelization of frame processing across many low-frequency cores, while using software to maintain total frame ordering. Finally, the addition of two new atomic read-modify-write instructions reduces frame ordering overheads by 50%. Combining these hardware and software mechanisms enables a network interface card to saturate a full-duplex 10 Gb/s Ethernet link by utilizing 6 processor cores and 4 banks of on-chip SRAM operating at 166 MHz, along with external 500 MHz GDDR SDRAM.
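To give a rough sense of what "high frame rate processing" means at this link speed, the sketch below estimates peak frame rates and the aggregate core cycles available per frame. It is a back-of-the-envelope illustration, not a calculation taken from the paper: the frame sizes (1518 and 64 bytes) and the 20 bytes of per-frame wire overhead (preamble plus inter-frame gap) are standard Ethernet values assumed here, the abstract does not say which frame sizes were evaluated, and the function names are hypothetical.

```python
# Back-of-the-envelope frame-rate budget for a full-duplex 10 Gb/s link.
# Assumption (not from the abstract): standard Ethernet wire overhead of
# an 8-byte preamble/SFD plus a 12-byte inter-frame gap per frame.
WIRE_OVERHEAD_BYTES = 8 + 12
LINK_BPS = 10_000_000_000  # 10 Gb/s in each direction


def frames_per_second(frame_bytes: int, link_bps: int = LINK_BPS) -> float:
    """Peak frames per second in one direction for a given frame size.

    frame_bytes counts the Ethernet header, payload, and FCS.
    """
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


def cycles_per_frame(frame_bytes: int, cores: int = 6,
                     core_hz: float = 166e6) -> float:
    """Aggregate core cycles available per frame at full-duplex line rate.

    The defaults (6 cores at 166 MHz) are the figures quoted in the abstract.
    """
    full_duplex_fps = 2 * frames_per_second(frame_bytes)
    return cores * core_hz / full_duplex_fps


if __name__ == "__main__":
    for size in (1518, 64):  # assumed maximum and minimum frame sizes
        print(f"{size:5d} B: {frames_per_second(size):,.0f} frames/s per "
              f"direction, {cycles_per_frame(size):.0f} cycles per frame")
```

Under these assumptions the cycle budget per frame is small even for maximum-sized frames, which suggests why the abstract emphasizes low-latency access to control data and low frame ordering overheads.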
Citation:
Paul Willmann, Hyong-youb Kim, Scott Rixner, Vijay S. Pai, "An Efficient Programmable 10 Gigabit Ethernet Network Interface Card," 11th International Symposium on High-Performance Computer Architecture (HPCA'05), pp. 96-107, 2005