Issue No. 03 - March (1973 vol. 22)
W.G. Bouricius, T. J. Watson Research Center, IBM Corporation
This paper reports a study on the design and modeling of a highly reliable bubble-memory system. This system has the capability of correcting a single 16-adjacent bit-group error resulting from failures in a single basic storage module (BSM), and detecting with a probability greater than 0.99 any double errors resulting from failures in BSMs. The encoding/decoding network (memory translator) is designed to be self-checking, i.e., a single circuit failure in the translator will not produce an erroneous output that goes undetected. The system is able to perform reliable configuration in the event of uncorrectable BSM failures, memory translator failures, and dual-memory buffer failures, even in the presence of a single failure in the status registers controlling the configuration network. The bubble memory under study permits serial accessing of the store with 64 x 1024 bit blocks at a 100-kHz rate. The objective of this study is to develop good fault-tolerant design and analysis methods adequate for newly emerging technologies and to prove their practicality by example. The reliability modeling study justifies the adopted design philosophy of employing memory data encoding and a translator to correct single group errors and detect double group errors, thereby enhancing the overall system reliability. By a proper design of the memory translator based on a new checking technique, a uniformly high percentage of multiple b-adjacent bit-group error detection is achieved through the use of a proposed code (detects 99.99695 percent of double b-adjacent bit-group errors and 99.9985 percent of triple or more b-adjacent bit-group errors).
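As an arithmetic aside, the two detection percentages quoted above agree numerically with 1 - 2^-15 and 1 - 2^-16. The short Python sketch below only checks that agreement. The function name `detection_percent` and the exponents 15 and 16 are assumptions chosen to match the quoted figures; the abstract does not say how the authors derived them.

```python
# Hypothetical check, not taken from the paper: compare the quoted
# detection percentages with the closed forms 100 * (1 - 2**-k).

def detection_percent(k: int) -> float:
    """Percentage detected if a fraction 2**-k of errors goes undetected."""
    return 100.0 * (1.0 - 2.0 ** -k)

# Exponents 15 and 16 are assumptions picked to match the abstract's numbers.
double_rate = detection_percent(15)  # compare with the quoted 99.99695 percent
triple_rate = detection_percent(16)  # compare with the quoted 99.9985 percent

if __name__ == "__main__":
    print(f"double b-adjacent group errors: {double_rate:.5f} percent")
    print(f"triple or more group errors:    {triple_rate:.5f} percent")
```

Both values match the abstract's figures to the precision quoted there, which suggests (but does not prove) that the undetected fractions are powers of two.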
Block-oriented memory, bubble memory, error-correcting codes, memory translators, reliability modeling, self-checking translators, standby sparing, switching algorithm.
W.C. Carter, E.P. Hsieh, W.G. Bouricius, A.B. Wadia, D.C. Jessep, "Modeling of a Bubble-Memory Organization with Self-Checking Translators to Achieve High Reliability", IEEE Transactions on Computers, vol. 22, no. 3, pp. 269-275, March 1973, doi:10.1109/T-C.1973.223706