Grid and Cloud Computing, International Conference on (2008)
Oct. 24, 2008 to Oct. 26, 2008
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/GCC.2008.21
Some burgeoning web applications, such as web search engines, need to track, store and analyze massive volumes of real-time user access logs with around-the-clock (24/7) availability. Traditional high-availability approaches designed for general-purpose transaction applications are usually not efficient or economical enough to store this high-rate, insertion-only archived stream data. This paper proposes RAIDB5 (Redundant Array of Independent DataBases level 5) to store and manage these archived streams. We design read-optimized and write-optimized algorithms, which are distinct from the read-optimized ones in RAID5. These algorithms significantly improve the performance of write operations by classifying the data files of the databases into live and history files. Based on a Markov model, we compare the performance, availability and cost of RAIDB5 with those of other traditional database backup modes. The analysis shows that RAIDB5 offers better performance at the same cost, and better performability and lower cost at the same system scale, than the classical double replication mode.
database cluster, large-scale archived stream
Zhuxi Zhang, Kai Du, Huaimin Wang, Shuqiang Yang, "RAIDB5: An Economical and High Available Database Cluster for Large-Scale Archived Stream", Grid and Cloud Computing, International Conference on, pp. 273-280, 2008, doi:10.1109/GCC.2008.21
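
The abstract names RAIDB5 after RAID level 5 but does not spell out its algorithms. As a rough illustration of the generic RAID5 idea it builds on, the sketch below computes an XOR parity block over equal-length data blocks, rotates the parity location across nodes, and rebuilds one lost block from the survivors. It is not the paper's method: the function names (`xor_blocks`, `parity_node`, `rebuild`), the rotation rule, and the sample log blocks are all assumptions made for this example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks into one block."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def parity_node(stripe_index, node_count):
    """Pick the node holding parity for a stripe (hypothetical rotation rule)."""
    return (node_count - 1 - stripe_index) % node_count


def rebuild(surviving_blocks):
    """Recover the single missing block of a stripe.

    Because parity is the XOR of all data blocks, XOR-ing the surviving
    data blocks with the parity block yields the missing one.
    """
    return xor_blocks(surviving_blocks)


# Illustrative stripe: three data blocks plus one parity block.
data = [b"log-0001", b"log-0002", b"log-0003"]
parity = xor_blocks(data)

# Simulate losing the second block and rebuilding it from the rest.
recovered = rebuild([data[0], data[2], parity])
```

In this sketch, every write to a data block also requires updating the parity block, which is the write cost that any RAID5-style scheme has to manage.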