RAPID-Cache-A Reliable and Inexpensive Write Cache for High Performance Storage Systems
March 2002 (vol. 13 no. 3)
pp. 290-307

Modern high performance disk systems make extensive use of nonvolatile RAM (NVRAM) write caches. A single-copy NVRAM cache creates a single point of failure, while a dual-copy NVRAM cache is very expensive because of the high cost of NVRAM. This paper presents a new cache architecture called RAPID-Cache, for Redundant, Asymmetrically Parallel, and Inexpensive Disk Cache. A typical RAPID-Cache consists of two redundant write buffers on top of a disk system. One buffer is a primary cache made of RAM or NVRAM; the other is a backup cache containing a two-level hierarchy: a small NVRAM buffer on top of a log disk. The small NVRAM buffer collects small writes and writes them to the log disk in large units. By exploiting the locality of I/O accesses and taking advantage of the well-known Log-structured File System approach, the backup cache achieves nearly the same write performance as the primary RAM cache. The read performance of the backup cache is less critical because normal reads are served by the primary RAM cache, and reads from the backup cache happen only during error recovery. The RAPID-Cache is thus an asymmetric architecture, with a fast-write-fast-read RAM as the primary cache and a fast-write-slow-read NVRAM-disk hierarchy as the backup cache. The asymmetrically parallel architecture, together with an algorithm that separates actively accessed data from inactive data in the cache, virtually eliminates the garbage collection overhead, which is the major problem of previous solutions such as Log-structured File Systems and Disk Caching Disk. The asymmetric cache allows cost-effective designs of very large write caches for high-end parallel disk systems that would otherwise have to use costly dual-copy NVRAM caches. It also makes reliable write caching possible for low-end disk I/O systems, since the RAPID-Cache uses inexpensive disks to perform reliable caching.
Our analysis and trace-driven simulation results show that the RAPID-Cache has significant reliability/cost advantages over conventional single NVRAM write caches and has great cost advantages over dual-copy NVRAM caches. The RAPID-Cache architecture opens a new dimension for disk system designers to exercise trade-offs among performance, reliability, and cost.
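
The write and recovery paths described in the abstract can be illustrated with a toy model. This is only a sketch under stated assumptions, not the authors' implementation: the class names `BackupCache` and `RapidCache`, the `nvram_capacity` value, and the use of in-memory dictionaries and lists in place of RAM, NVRAM, and the log disk are all hypothetical. Destaging to the data disks and the separation of active from inactive data are left out.

```python
from dataclasses import dataclass, field


@dataclass
class BackupCache:
    """Toy backup cache: a small NVRAM buffer in front of an append-only log."""
    nvram_capacity: int = 4                     # assumed size, in blocks
    nvram: dict = field(default_factory=dict)   # stands in for the NVRAM buffer
    log: list = field(default_factory=list)     # stands in for the log disk

    def write(self, block, data):
        # Repeated writes to the same block coalesce in the NVRAM buffer.
        self.nvram[block] = data
        if len(self.nvram) >= self.nvram_capacity:
            # Many small writes leave the buffer as one large log write.
            self.log.append(dict(self.nvram))
            self.nvram.clear()

    def recover(self):
        # Slow read path, used only after a failure: replay the log
        # segments in order, then apply the newest data still in NVRAM.
        blocks = {}
        for segment in self.log:
            blocks.update(segment)
        blocks.update(self.nvram)
        return blocks


@dataclass
class RapidCache:
    """Toy model of the two redundant write buffers."""
    primary: dict = field(default_factory=dict)  # stands in for the RAM cache
    backup: BackupCache = field(default_factory=BackupCache)

    def write(self, block, data):
        # Every write goes to both buffers.
        self.primary[block] = data
        self.backup.write(block, data)

    def read(self, block):
        # Normal reads are served by the primary cache only.
        return self.primary.get(block)

    def recover_primary(self):
        # Rebuild the primary cache from the backup after a failure.
        self.primary = self.backup.recover()


if __name__ == "__main__":
    cache = RapidCache()
    for i in range(6):
        cache.write(i, f"data{i}")
    cache.write(0, "new0")      # overwrite a block that is already logged
    cache.primary.clear()       # simulate losing the primary cache
    cache.recover_primary()
    print(cache.read(0), len(cache.backup.log))
```

In this model the asymmetry is visible in the method bodies: `write` on the backup is a cheap buffer update with an occasional large append, while `recover` has to scan the whole log.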

Index Terms:
disks, storage systems, performance, reliability, fault-tolerance
Y. Hu, T. Nightingale, Q. Yang, "RAPID-Cache-A Reliable and Inexpensive Write Cache for High Performance Storage Systems," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 290-307, March 2002, doi:10.1109/71.993208