eCommons

 

Checkpointing and Rollback-Recovery for Distributed Systems

dc.contributor.author: Koo, Richard (en_US)
dc.contributor.author: Toueg, Sam (en_US)
dc.date.accessioned: 2007-04-23T17:11:14Z
dc.date.available: 2007-04-23T17:11:14Z
dc.date.issued: 1985-10 (en_US)
dc.description.abstract: We consider the problem of bringing a distributed system to a consistent state after transient failures. We address the two components of this problem by describing a distributed algorithm to create consistent checkpoints, as well as a rollback-recovery algorithm to recover the system to a consistent state. In contrast to previous algorithms, they tolerate failures that occur during their executions. Furthermore, when a process takes a checkpoint, a minimal number of additional processes are forced to take checkpoints. Similarly, when a process rolls back and restarts after a failure, a minimal number of additional processes are forced to roll back with it. Our algorithms require each process to store at most two checkpoints in stable storage. This storage requirement is shown to be minimal under general assumptions. (en_US)
dc.format.extent: 2072117 bytes
dc.format.extent: 447350 bytes
dc.format.mimetype: application/pdf
dc.format.mimetype: application/postscript
dc.identifier.citation: http://techreports.library.cornell.edu:8081/Dienst/UI/1.0/Display/cul.cs/TR85-706 (en_US)
dc.identifier.uri: https://hdl.handle.net/1813/6546
dc.language.iso: en_US (en_US)
dc.publisher: Cornell University (en_US)
dc.subject: computer science (en_US)
dc.subject: technical report (en_US)
dc.title: Checkpointing and Rollback-Recovery for Distributed Systems (en_US)
dc.type: technical report (en_US)

Files

Original bundle
Name: 85-706.pdf
Size: 1.98 MB
Format: Adobe Portable Document Format

Name: 85-706.ps
Size: 436.87 KB
Format: Postscript Files