Shadow computing: An energy-aware fault tolerant computing model

Bryan Mills, Taieb Znati, Rami Melhem

Research output: Contribution to conference, Paper, peer-review

23 Citations (Scopus)

Abstract

The current response to fault tolerance relies upon either time or hardware redundancy to mask faults. Time redundancy implies re-execution of the failed computation after the failure has been detected; although this can be further optimized by the use of checkpoints, these solutions still impose a significant delay. In many mission-critical systems, hardware redundancy has traditionally been deployed in the form of process replication to provide fault tolerance, avoiding delay and maintaining tight deadlines. Both approaches have drawbacks: re-execution requires additional time, and replication requires additional resources, especially energy. This forces the systems engineer to choose between time and hardware redundancy; cloud computing environments have largely chosen replication because response time is often critical. In this paper we propose a new computational model called shadow computing, which provides goal-based adaptive resilience through the use of dynamic execution. Using this general model, we develop shadow replication, which enables a parameterized tradeoff between time and hardware redundancy to provide fault tolerance. We then build an analytical model to predict the expected energy savings and provide an analysis using that model.

Original language: English
Pages: 73-77
Number of pages: 5
Publication status: Published - 2014
Externally published: Yes
Event: 2014 International Conference on Computing, Networking and Communications, ICNC 2014 - Honolulu, HI, United States
Duration: Feb 3 2014 to Feb 6 2014

Conference

Conference: 2014 International Conference on Computing, Networking and Communications, ICNC 2014
Country/Territory: United States
City: Honolulu, HI
Period: 2/3/14 to 2/6/14

Keywords

  • fault tolerance
  • resiliency
  • scheduling
  • shadow computing

ASJC Scopus subject areas

  • Computer Networks and Communications
