Most replicated storage and file systems take either a specialized hardware approach or a software-oriented approach to fault tolerance. This paper describes a fault-tolerant disk storage and file system that falls between the hardware and software categories. The system uses reflective memory to interconnect an array of standard computers that together form a massively parallel system. This architecture provides the basis for high-availability replicated file and storage systems with the performance and low overhead expected from specialized hardware, while offering the modularity and scalability of a distributed system. We describe an implementation of the fault-tolerant file and storage system designed to run large-scale I/O-intensive applications, such as emulation of a stable storage DASD subsystem. Preliminary performance measurements indicate that selectively broadcasting regions of reflective memory supports replicated, distributed storage and file services with virtually no overhead relative to conventional systems.
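The idea of selectively broadcasting regions of reflective memory can be illustrated with a small sketch. This is a toy software model written for illustration only, not the implementation from the paper: the names `Node`, `ReflectiveMemory`, `reflect_region`, and `write` are all hypothetical, and real reflective memory performs the propagation in hardware. The sketch assumes that a write is replicated to every node only when it falls entirely inside a region marked as reflected, and otherwise stays local to the writer.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One computer holding a local copy of the memory (hypothetical model)."""
    name: str
    memory: bytearray


@dataclass
class ReflectiveMemory:
    """Toy model: writes to reflected regions are copied to every node."""
    size: int
    nodes: list = field(default_factory=list)
    reflected: list = field(default_factory=list)  # (start, end) byte ranges

    def add_node(self, name):
        """Attach a node with a zero-filled local memory of the shared size."""
        node = Node(name, bytearray(self.size))
        self.nodes.append(node)
        return node

    def reflect_region(self, start, end):
        """Mark the byte range [start, end) as broadcast on write."""
        self.reflected.append((start, end))

    def _is_reflected(self, offset, length):
        # A write counts as reflected only if it lies wholly inside one region.
        return any(s <= offset and offset + length <= e for s, e in self.reflected)

    def write(self, writer, offset, data):
        """Apply a write and return the number of nodes that received it."""
        if offset < 0 or offset + len(data) > self.size:
            raise ValueError("write outside memory bounds")
        targets = self.nodes if self._is_reflected(offset, len(data)) else [writer]
        for node in targets:
            node.memory[offset:offset + len(data)] = data
        return len(targets)
```

In this model, a storage service would place replicated disk blocks in a reflected region and keep scratch data outside it, so that only the data needing replication incurs broadcast traffic.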
Index Terms:
shared memory systems; replicated databases; distributed databases; software fault tolerance; magnetic disc storage; fault-tolerant disk storage; file systems; reflective memory; replicated storage; software-oriented approach; fault tolerance; standard computers; massively parallel system; high availability replicated file and storage systems; storage systems; large scale I/O-intensive applications; stable storage DASD subsystem; performance measurements; distributed storage; file services
Citation:
N. Vekiarides, "Fault-tolerant disk storage and file systems using reflective memory," 28th Hawaii International Conference on System Sciences (HICSS'95), p. 103, 1995.