How Hadoop Clusters Break
PrePrint
ISSN: 0740-7459
Ariel Rabkin, UC Berkeley, Berkeley
Randy Katz, UC Berkeley, Berkeley
This article describes lessons from examining a sample of several hundred support tickets for the Hadoop ecosystem, a widely used group of "big data" storage and processing systems. We give a taxonomy of errors and describe how supporters address them today. We show that misconfigurations are the dominant cause of failures, and we describe these misconfigurations in detail. Using these failure reports, we identify some of the design "anti-patterns" and missing platform features that contribute to the problems we observed. We offer advice to developers about how to build more robust distributed systems. We also advise users and administrators on how to avoid some of the rough edges we found.
Citation:
Ariel Rabkin, Randy Katz, "How Hadoop Clusters Break," IEEE Software, 07 June 2012. IEEE Computer Society Digital Library. IEEE Computer Society, <http://doi.ieeecomputersociety.org/10.1109/MS.2012.73>