• No storage. The system generates checkpoints but doesn't store them.
• Centralized repository. The system stores checkpoints in a centralized repository.
• Replication. The system stores one copy of the checkpoint locally and another in a remote repository.
• Parity over local checkpoints. The system breaks the checkpoint into 10 slices, with one containing parity information, and stores them in distributed repositories.
• IDA ( m = 9, k = 1) . The system codes the checkpoint into 10 slices, from which nine are sufficient for recovery, and stores them in distributed repositories.
• IDA ( m = 8, k = 2) . The system codes the checkpoint into 10 slices, from which eight are sufficient for recovery, and stores them in distributed repositories.
Raphael Y. de Camargo is a doctoral candidate in the Department of Computer Science at the University of São Paulo. His research interests include grid computing, fault tolerance, distributed storage, and distributed-object systems. He received his master's degree in physics from the University of São Paulo. Contact him at Rua Doutor Monteiro Tapajós, 70, São Paulo SP, Brazil, 04152040; firstname.lastname@example.org.
Renato Cerqueira is a research scientist and a project manager at the Computer Graphics Technology Group of the Pontifical Catholic University of Rio de Janeiro (PUC-Rio), where he is also an assistant professor of computer science. His research interests include component-based development, object-oriented languages, middleware platforms, distributed programming, ubiquitous computing, and grid computing. He received his PhD in computer science from PUC-Rio. He's a member of the ACM and IEEE Computer Society. Contact him at Computer Science Dept., PUC-Rio, Rua Marques de São Vicente 225, 407RDC, Rio de Janeiro RJ, Brazil 22453900; email@example.com.
Fabio Kon is an associate professor in the Department of Computer Science at the University of São Paulo. His research interests include distributed-object systems, reflective middleware, dynamic reconfiguration and adaptation, mobile agents, computer music, multimedia, grid computing, and agile methods. He received his PhD in computer science from the University of Illinois at Urbana-Champaign. He's a member of the ACM and the Hillside Group. Contact him at Departamento de Ciência da Computação, Rua do Matão, 1010, São Paulo SP, Brazil, 05508090; firstname.lastname@example.org.