Issue No. 11 - November (2010 vol. 59)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TC.2010.61
Heeseung Jo, Korea Advanced Institute of Science and Technology, Daejeon
Hwanju Kim, Korea Advanced Institute of Science and Technology, Daejeon
Jae-Wan Jang, Korea Advanced Institute of Science and Technology, Daejeon
Joonwon Lee, SungKyunKwan University, Suwon
Seungryoul Maeng, Korea Advanced Institute of Science and Technology, Daejeon
In a consolidated server system using virtualization, physical device accesses from guest virtual machines (VMs) need to be coordinated. In this environment, a separate driver VM is usually assigned to this task to enhance reliability and to reuse existing device drivers. This driver VM needs to be highly reliable, since it handles all the I/O requests. This paper describes a mechanism to detect and recover the driver VM from faults to enhance the reliability of the whole system. The proposed mechanism is transparent in that guest VMs cannot recognize the fault and the driver VM can recover and continue its I/O operations. Our mechanism provides a progress monitoring-based fault detection that is isolated from fault contamination with low monitoring overhead. When a fault occurs, the system recovers by switching the faulted driver VM to another one. The recovery is performed without service disconnection or data loss and with negligible delay by fully exploiting the I/O structure of the virtualized system.
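The detect-and-switch idea in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the paper's implementation: the `DriverVM` and `ProgressMonitor` classes, the per-tick service model, and the stall threshold of three checks are all hypothetical. It assumes the monitor can read a progress counter from the driver VM and that pending requests are held outside the faulted driver, so they survive the switch.

```python
from dataclasses import dataclass, field


@dataclass
class DriverVM:
    """Hypothetical stand-in for a driver VM; not the paper's data structure."""
    name: str
    completed: int = 0      # progress counter: I/O requests completed so far
    hung: bool = False      # simulates a fault that stops all progress

    def service(self, pending: list) -> None:
        # A healthy driver completes one pending request per tick.
        if not self.hung and pending:
            pending.pop(0)
            self.completed += 1


@dataclass
class ProgressMonitor:
    """Declares a fault when requests are pending but the counter stalls."""
    active: DriverVM
    standby: DriverVM
    threshold: int = 3      # assumed number of stalled checks before failover
    pending: list = field(default_factory=list)
    last_seen: int = 0
    stalled: int = 0
    failovers: int = 0

    def check(self) -> None:
        # An idle driver (nothing pending) is never treated as faulty.
        if self.pending and self.active.completed == self.last_seen:
            self.stalled += 1
        else:
            self.stalled = 0
        self.last_seen = self.active.completed
        if self.stalled >= self.threshold:
            # Switch to the standby driver VM. Pending requests stay queued,
            # so the guest sees a delay rather than a lost request.
            self.active, self.standby = self.standby, self.active
            self.last_seen = self.active.completed
            self.stalled = 0
            self.failovers += 1


def simulate(ticks: int = 12, fault_at: int = 2) -> ProgressMonitor:
    """Run the monitor over a fixed workload, hanging the active driver once."""
    mon = ProgressMonitor(DriverVM("driver-a"), DriverVM("driver-b"))
    mon.pending = [f"req-{i}" for i in range(6)]
    for tick in range(ticks):
        if tick == fault_at:
            mon.active.hung = True
        mon.active.service(mon.pending)
        mon.check()
    return mon
```

Calling `simulate()` hangs the first driver after two completed requests; the monitor switches to the standby once the counter has stalled for three checks, and the remaining four requests complete there. Checking the counter only while requests are pending is the design choice that separates a hung driver from an idle one.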
Reliability, disconnected operation, fault tolerance, high availability.
H. Jo, S. Maeng, J. Lee, H. Kim and J. Jang, "Transparent Fault Tolerance of Device Drivers for Virtual Machines," in IEEE Transactions on Computers, vol. 59, no. 11, pp. 1466-1479, 2010.