2011 IEEE/IFIP 41st International Conference on Dependable Systems & Networks
Impact of temperature on hard disk drive reliability in large datacenters
Hong Kong, China
June 27-30, 2011
ISBN: 978-1-4244-9232-9
Sriram Sankar, Mark Shaw, Kushagra Vaid, "Impact of temperature on hard disk drive reliability in large datacenters," 2011 IEEE/IFIP 41st International Conference on Dependable Systems & Networks (DSN 2011), pp. 530-537, IEEE Computer Society, Los Alamitos, CA, USA, 2011.
DOI: http://doi.ieeecomputersociety.org/10.1109/DSN.2011.5958265
When datacenters are pushed to the limits of operational efficiency, reducing failure rates becomes critical to keeping servers healthy. In this experience report, we present a dense storage case study drawn from a large population of servers housing tens of thousands of disk drives. Previous studies have reported divergent results on the correlation between temperature and hard disk drive failures. In this paper, we establish a correlation between temperature and failures observed at three location granularities: a) across drive locations within a server chassis, b) across server locations within a rack, and c) across multiple racks in a datacenter. We also establish that temperature correlates more strongly with drive failures than disk utilization does. Thus, we show that temperature-aware server and datacenter design plays a pivotal role in datacenter reliability. Following the case study, we present a reliability model for estimating hard disk drive failures correlated with the datacenter operating temperature. The model is a physical Arrhenius model with empirically derived coefficients. We apply the model to select the datacenter inlet temperature setpoint for two different server storage configurations. Finally, with the help of a datacenter cost discussion, we highlight the need for reliability-aware datacenter design to increase efficiency in large-scale datacenters.
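To make the Arrhenius relationship mentioned in the abstract concrete, the following is a minimal Python sketch of the general textbook form of the model, not the authors' fitted model. The function names and the activation energy passed in are hypothetical placeholders; the abstract does not state the paper's empirically derived coefficients, so none are reproduced here.

```python
import math

# Boltzmann constant in eV/K (physical constant, not taken from the paper).
BOLTZMANN_EV_PER_K = 8.617333262e-5


def arrhenius_acceleration(temp_c, ref_temp_c, activation_energy_ev):
    """Ratio of the failure rate at temp_c to the rate at ref_temp_c.

    Generic Arrhenius form: AF = exp((Ea / k) * (1/T_ref - 1/T)),
    with temperatures converted to kelvin. activation_energy_ev is an
    ASSUMED input; the paper derives its coefficients empirically.
    """
    t = temp_c + 273.15
    t_ref = ref_temp_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_ref - 1.0 / t)
    return math.exp(exponent)


def max_temperature_for_budget(ref_temp_c, activation_energy_ev, max_acceleration):
    """Highest temperature (deg C) whose acceleration factor stays within
    max_acceleration, obtained by inverting the Arrhenius expression."""
    t_ref = ref_temp_c + 273.15
    inv_t = 1.0 / t_ref - math.log(max_acceleration) * BOLTZMANN_EV_PER_K / activation_energy_ev
    return 1.0 / inv_t - 273.15


if __name__ == "__main__":
    # Illustrative values only: 0.5 eV and a 40 deg C reference are placeholders.
    print(arrhenius_acceleration(50.0, 40.0, 0.5))
    print(max_temperature_for_budget(40.0, 0.5, 2.0))
```

A setpoint study of the kind the abstract describes would additionally need a mapping from inlet temperature to drive temperature for each server configuration; that mapping is not given in the abstract and is left out of this sketch.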
