A flexible state-saving library for message-passing systems

GIANUZZI, VITTORIA
1998-01-01

Abstract

Message-passing applications on a distributed computer require tools that integrate state saving and rollback in order to support dynamic program reconfiguration, fault tolerance, and similar features. The paper presents the results of integrating two independently developed tools that combine flexibility and portability. User-Triggered CheckPointing (UTCP) provides checkpointing and recovery, relying on the programmer to indicate the position of the recovery line and the contents of the checkpoint. PVMsnap extends PVM to obtain a consistent cut of the message-passing application. Combining the two tools yields a portable and flexible solution for fault tolerance that can be adapted to the application's needs.
1998
9780818683329

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/200605