Issue No.02 - July-Dec. (2012 vol.11)
pp: 53-56
Christina Delimitrou, Stanford University, Stanford
Sriram Sankar, Microsoft, Seattle
Kushagra Vaid, Microsoft, Seattle
Christos Kozyrakis, Stanford University, Stanford
Suboptimal storage design has a significant cost and power impact in large-scale datacenters (DCs). Performance-, power- and cost-optimized systems require a deep understanding of target workloads, and mechanisms to effectively model different storage design choices. Traditional benchmarking is invalid in cloud data-stores and representative storage profiles are hard to obtain, while replaying applications in different storage configurations is impractical in both cost and time. Despite these issues, current workload generators are not able to reproduce key aspects of real application patterns (e.g., spatial/temporal locality, I/O intensity). In this paper, we propose a modeling and generation framework for large-scale storage applications. As part of this framework we use a state diagram-based storage model, extend it to a hierarchical representation, and implement a tool that consistently recreates DC application I/O loads. We present the principal features of the framework that allow accurate modeling and generation of storage workloads, and the validation process performed against ten original DC application traces. Finally, we explore two practical applications of this methodology: SSD caching and defragmentation benefits on enterprise storage. Since knowledge of the workload's spatial and temporal locality is necessary to model these use cases, our framework was instrumental in quantifying their performance benefits. The proposed methodology provides a detailed understanding of the storage activity of large-scale applications, and enables a wide spectrum of storage studies without requiring access to application code or a full application deployment.
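To make the idea of a state diagram-based workload model concrete, the following is a minimal illustrative sketch and not the tool described in the paper. It assumes that each state stands for a range of block addresses with its own read/write mix, and that transitions between states carry probabilities. The state names, block ranges, probabilities, and the function `generate_requests` are all hypothetical values chosen for illustration.

```python
import random

# Hypothetical two-state model: each state covers a block range and has an
# I/O mix. The numbers are illustrative, not measured from any trace.
STATES = {
    "hot":  {"blocks": (0, 1_000),       "read_fraction": 0.9},
    "cold": {"blocks": (1_000, 100_000), "read_fraction": 0.5},
}

# Assumed transition probabilities out of each state (each row sums to 1).
TRANSITIONS = {
    "hot":  {"hot": 0.8, "cold": 0.2},
    "cold": {"hot": 0.6, "cold": 0.4},
}


def generate_requests(count, start="hot", seed=0):
    """Walk the state diagram and emit (state, op, block) tuples."""
    rng = random.Random(seed)
    state = start
    requests = []
    for _ in range(count):
        low, high = STATES[state]["blocks"]
        is_read = rng.random() < STATES[state]["read_fraction"]
        op = "read" if is_read else "write"
        requests.append((state, op, rng.randrange(low, high)))
        # Pick the next state according to the current state's transition row.
        next_states = list(TRANSITIONS[state])
        weights = [TRANSITIONS[state][s] for s in next_states]
        state = rng.choices(next_states, weights=weights)[0]
    return requests


if __name__ == "__main__":
    for request in generate_requests(5):
        print(request)
```

In a sketch like this, a high self-transition probability on a narrow block range is what stands in for spatial and temporal locality. A hierarchical variant could, under the same assumptions, nest a finer-grained diagram inside each state.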
Load modeling, Very large scale integration, Throughput, Storage area networks, Computational modeling, Electronic mail, Generators, Modeling techniques, Modeling of computer architecture, Super (very large) computers, Mass storage
Christina Delimitrou, Sriram Sankar, Kushagra Vaid, Christos Kozyrakis, "Decoupling Datacenter Storage Studies from Access to Large-Scale Applications", IEEE Computer Architecture Letters, vol.11, no. 2, pp. 53-56, July-Dec. 2012, doi:10.1109/L-CA.2011.37