Detection and Recovery Techniques for Database Corruption
September/October 2003 (vol. 15 no. 5)
pp. 1120-1136

Abstract—Increasingly, for extensibility and performance, special purpose application code is being integrated with database system code. Such application code has direct access to database system buffers, and as a result, the danger of data being corrupted due to inadvertent application writes is increased. Previously proposed hardware techniques to protect from corruption require system calls, and their performance depends on details of the hardware architecture. We investigate an alternative approach which uses codewords associated with regions of data to detect corruption and to prevent corrupted data from being used by subsequent transactions. We develop several such techniques which vary in the level of protection, space overhead, performance, and impact on concurrency. These techniques are implemented in the Dalí main-memory storage manager, and the performance impact of each on normal processing is evaluated. Novel techniques are developed to recover when a transaction has read corrupted data caused by a bad write and gone on to write other data in the database. These techniques use limited and relatively low-cost logging of transaction reads to trace the corruption and may also prove useful when resolving problems caused by incorrect data entry and other logical errors.
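
As a rough illustration of the two ideas in the abstract, the sketch below pairs a data region with a codeword and then traces corruption forward through a log of transaction reads and writes. It is not the Dalí implementation. The XOR-of-words codeword, the names `ProtectedRegion` and `trace_tainted`, the 4-byte word size, and the region-granularity log format are all assumptions made for this example.

```python
import struct
from functools import reduce

WORD = 4  # bytes per word; an assumption for this sketch


def xor_codeword(buf: bytes) -> int:
    """XOR together all 32-bit little-endian words in buf."""
    words = struct.unpack(f"<{len(buf) // WORD}I", buf)
    return reduce(lambda a, b: a ^ b, words, 0)


class ProtectedRegion:
    """A fixed-size data region with an associated codeword (hypothetical)."""

    def __init__(self, size: int):
        assert size % WORD == 0
        self.data = bytearray(size)
        self.codeword = 0  # XOR of an all-zero region is zero

    def write(self, offset: int, new: bytes) -> None:
        """Prescribed update path: keeps the codeword in step with the data."""
        assert offset % WORD == 0 and len(new) % WORD == 0
        assert offset + len(new) <= len(self.data)
        old = bytes(self.data[offset:offset + len(new)])
        self.data[offset:offset + len(new)] = new
        # XOR out the old words and XOR in the new ones.
        self.codeword ^= xor_codeword(old) ^ xor_codeword(new)

    def audit(self) -> bool:
        """Recompute the codeword; False means the region is corrupt."""
        return xor_codeword(bytes(self.data)) == self.codeword


def trace_tainted(corrupt_regions, log):
    """Propagate corruption forward through logged reads and writes.

    log is a list of (txn, regions_read, regions_written) tuples, assumed
    to be in serialization order. Returns the transactions that read
    tainted data and the full set of tainted regions.
    """
    tainted = set(corrupt_regions)
    suspects = []
    for txn, reads, writes in log:
        if tainted & set(reads):
            suspects.append(txn)
            tainted |= set(writes)  # anything it wrote is now suspect too
    return suspects, tainted


# A stray write that bypasses write() leaves the codeword stale.
region = ProtectedRegion(16)
region.write(0, struct.pack("<I", 7))
ok_before = region.audit()
region.data[4:8] = b"\xff\xff\xff\xff"  # simulated inadvertent application write
ok_after = region.audit()

log = [("T1", {"A"}, {"B"}), ("T2", {"C"}, {"D"}), ("T3", {"B"}, {"E"})]
suspects, tainted = trace_tainted({"A"}, log)
```

In this sketch an audit only catches corruption when it runs, and an XOR codeword misses stray writes whose effects cancel out. The tracing step is deliberately coarse: it marks everything a suspect transaction wrote as tainted, which is conservative.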

Index Terms:
Database corruption, database recovery, fault tolerance, data integrity, main-memory database.
Citation:
Philip Bohannon, Rajeev Rastogi, S. Seshadri, Avi Silberschatz, S. Sudarshan, "Detection and Recovery Techniques for Database Corruption," IEEE Transactions on Knowledge and Data Engineering, vol. 15, no. 5, pp. 1120-1136, Sept.-Oct. 2003, doi:10.1109/TKDE.2003.1232268