Adaptive Recovery with Hierarchical Checkpointing on Workstation Clusters

Author: 周志賢, Chi-Yin Edward Chow
Publisher: Open Dissertation Press
ISBN: 9781374719316

Publication Date: 27 January 2017
Format: Hardback
Availability: Temporarily unavailable

The supplier advises that this item is temporarily unavailable. It will be ordered for you and placed on backorder. Once it does come back in stock, we will ship it out to you.

Our Price: $155.76

Overview

This dissertation, Adaptive Recovery with Hierarchical Checkpointing on Workstation Clusters by 周志賢 (Chi-Yin Edward Chow), was obtained from The University of Hong Kong (Pokfulam, Hong Kong) and is being sold pursuant to the Creative Commons Attribution 3.0 Hong Kong License. The content of this dissertation has not been altered in any way. We have altered the formatting to make the dissertation easier to print and read. All rights not granted by the above license are retained by the author.

Abstract of thesis entitled Adaptive Recovery with Hierarchical Checkpointing on Workstation Clusters, submitted by Edward Chi-yin Chow for the degree of Master of Philosophy at the University of Hong Kong in August 1999.

Fault tolerance is essential for pushing the scalability limits of reliable network-based computing systems. This thesis proposes an adaptive checkpointing and recovery scheme for both disk-based and diskless checkpointing, with reduced recovery latency and performance overhead. The purpose is to build fault tolerance and recovery capability into clusters with a minimum architectural upgrade.

Improving on traditional disk-based checkpointing, which stores checkpoints on local disk, this thesis proposes a hierarchical checkpointing scheme for adaptive rollback recovery. Checkpoints at three architectural levels are suggested. The costs of checkpointing at the various levels are characterized and analyzed quantitatively. Guidelines for optimizing the checkpoint hierarchy are given. Potential performance gains of the fault-tolerant cluster design are presented, and drawbacks are also discussed.

Improving on Li and Plank's diskless checkpointing and Vaidya's 2-level approach, this thesis develops an interleaved checkpointing scheme for adaptive recovery. The idea is based on interleaved mirroring in network memory and in stable storage. The interleaved scheme provides wider fault coverage than traditional schemes and is designed to tolerate single, double, and some multiple faults adaptively. Through theoretical analysis, we show that much lower latency can be expected when this adaptive recovery handles the most frequently encountered failure types.

Using a Markov cost model, the checkpoint overheads and recovery latency of the various schemes are quantified. Our results reveal how their relative performance depends on the cluster size, interleaving degree, job length, and failure rate. Possible implementations of these adaptive schemes are discussed with a tradeoff study. These schemes are especially suited to the construction of large-scale Unix workstation clusters or Beowulf PC/Linux clusters.

DOI: 10.5353/th_b2981291

Subjects: Fault-tolerant computing; Client/server computing; Computer networks

Full Product Details

Author: 周志賢, Chi-Yin Edward Chow
Publisher: Open Dissertation Press
Imprint: Open Dissertation Press
Dimensions: Width: 21.60cm, Height: 1.00cm, Length: 27.90cm
Weight: 0.608kg
ISBN: 9781374719316

ISBN 10: 1374719315
Publication Date: 27 January 2017
Audience: General/trade, General
Format: Hardback
Publisher's Status: Active

Countries Available: All regions
