Title: | A flexible state-saving library for message-passing systems |
Authors: | |
Publication date: | 1998 |
Abstract: | Message-passing applications on a distributed computer require tools that integrate state saving and rollback in order to support dynamic program reconfiguration, fault tolerance, and similar features. The paper presents the results of integrating two independently developed tools that combine flexibility and portability. User-Triggered CheckPointing (UTCP) provides checkpointing and recovery, relying on the programmer to indicate the position of the recovery line and the contents of the checkpoint. PVMsnap extends PVM to obtain a consistent cut of the message-passing application. Combining the two tools yields a portable and flexible solution for fault tolerance that can be adapted to the application's needs. |
Handle: | http://hdl.handle.net/11567/200605 |
ISBN: | 9780818683329 |
Appears in types: | 04.01 - Contribution in conference proceedings |