Issue No.01 - January (2011 vol.60)
pp: 64-79
Suzhen Wu, Huazhong University of Science and Technology, Wuhan and Xiamen University, Xiamen
Hong Jiang, University of Nebraska-Lincoln, Lincoln
Dan Feng, Huazhong University of Science and Technology, Wuhan
Lei Tian, Huazhong University of Science and Technology, Wuhan and University of Nebraska-Lincoln, Lincoln
Bo Mao, Huazhong University of Science and Technology, Wuhan
ABSTRACT
Because user I/O and online low-priority background tasks contend for the same shared disk bandwidth, user I/O intensity can significantly degrade the performance of those background tasks, thus reducing the reliability and availability of RAID-structured storage systems. In this paper, we propose a novel and practical scheme, called WorkOut (I/O Workload Outsourcing), to significantly boost the performance of low-priority background tasks. While a degraded RAID set is performing such tasks, WorkOut outsources all write requests and popular read requests originally targeted at that set to a surrogate RAID set. A lightweight prototype implementation of WorkOut and extensive trace-driven and benchmark-driven experiments on two case studies demonstrate that, compared with existing approaches, WorkOut effectively improves the performance of low-priority background tasks such as RAID reconstruction and RAID resynchronization. Importantly, WorkOut is portable and can be easily incorporated into existing optimization algorithms for RAID-structured storage systems.
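The redirection policy described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `OutsourcingRouter`, the `popularity_threshold` parameter and its value, and the rule that a block becomes "popular" after a fixed number of reads are all assumptions made for illustration. The sketch also assumes that once a block has been written to (or copied to) the surrogate RAID set, later reads of that block are served from the surrogate so that they see current data.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class OutsourcingRouter:
    """Hypothetical router sketching the outsourcing policy from the abstract."""

    # Assumed popularity cutoff; the abstract does not give a value.
    popularity_threshold: int = 2
    # Per-block count of reads served by the degraded RAID set.
    read_counts: Counter = field(default_factory=Counter)
    # Blocks whose current copy is assumed to live on the surrogate RAID set.
    redirected: set = field(default_factory=set)

    def route(self, op: str, block: int) -> str:
        """Return 'surrogate' or 'degraded' for a single request."""
        if op == "write":
            # All writes are outsourced while the background task runs.
            self.redirected.add(block)
            return "surrogate"
        if op != "read":
            raise ValueError(f"unsupported operation: {op!r}")
        if block in self.redirected:
            # The surrogate already holds this block.
            return "surrogate"
        # This read is served by the degraded set; count it.
        self.read_counts[block] += 1
        if self.read_counts[block] >= self.popularity_threshold:
            # Assumption: a block that turns popular is copied to the
            # surrogate, so subsequent reads are outsourced.
            self.redirected.add(block)
        return "degraded"


router = OutsourcingRouter(popularity_threshold=2)
for op, block in [("write", 7), ("read", 7), ("read", 3), ("read", 3), ("read", 3)]:
    print(op, block, "->", router.route(op, block))
```

In this sketch the degraded RAID set only serves reads of blocks that are neither recently written nor popular, which is the effect the abstract attributes to outsourcing: more disk bandwidth is left for the background task. Moving the redirected data back after the task completes is outside the scope of the sketch.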
INDEX TERMS
Low-priority background tasks, RAID reconstruction, reliability, availability, performance evaluation.
CITATION
Suzhen Wu, Hong Jiang, Dan Feng, Lei Tian, Bo Mao, "Improving Availability of RAID-Structured Storage Systems by Workload Outsourcing," IEEE Transactions on Computers, vol. 60, no. 1, pp. 64-79, January 2011, doi:10.1109/TC.2010.206