Failure semantics

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Tony1 (talk | contribs) at 11:04, 17 December 2011 (fixed dashes using a script). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Failure semantics is a concept used in distributed computing to describe and classify errors that distributed systems can experience.[1][2]

Types of errors

The following types of errors can occur:

  • An omission error occurs when one or more responses fail to arrive.
    • A crash error occurs when nothing happens at all. A crash is the special case of omission in which all responses fail.
  • A timing error occurs when one or more responses arrive outside the specified time interval. Timing errors can be early or late. An omission error can be regarded as a timing error in which the response is infinitely late.
  • An arbitrary error is any error (e.g. a wrong value or a timing error).

When a client uses a server, it can be built to cope with certain types of errors from that server:

  • If it can manage a crash of the server, it is said to assume that the server has crash failure semantics.
  • If it can manage a service omission, it is said to assume that the server has omission failure semantics.

Failure semantics are thus the types of errors that are expected to appear. Should another type of error appear, it leads to a service failure, because the client cannot manage it.
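
The classification and the idea of assumed failure semantics can be illustrated with a minimal Python sketch. All names here (ErrorType, Response, classify, outcome) are hypothetical and do not come from the cited sources. The sketch assumes that the error classes are nested as described above (crash within omission, omission within timing, everything within arbitrary), so a client that assumes a given class is also taken to manage the narrower ones.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Optional


class ErrorType(IntEnum):
    # Ordered so that each class includes the ones before it. This ordering
    # is an assumption drawn from the nesting described in the text.
    NONE = 0
    CRASH = 1
    OMISSION = 2
    TIMING = 3
    ARBITRARY = 4


@dataclass
class Response:
    # Hypothetical record of what the client observed for one request.
    arrival: Optional[float]  # seconds after the request; None = never arrived
    correct: bool = True      # whether the returned value was the expected one


def classify(responses: List[Response], earliest: float, latest: float) -> ErrorType:
    """Return the narrowest error class that covers everything observed."""
    if not responses:
        return ErrorType.NONE
    arrived = [r for r in responses if r.arrival is not None]
    if any(not r.correct for r in arrived):
        return ErrorType.ARBITRARY          # a wrong value was returned
    if any(not (earliest <= r.arrival <= latest) for r in arrived):
        return ErrorType.TIMING             # a response was early or late
    if not arrived:
        return ErrorType.CRASH              # every response was omitted
    if len(arrived) < len(responses):
        return ErrorType.OMISSION           # some responses were omitted
    return ErrorType.NONE


def outcome(observed: ErrorType, assumed: ErrorType) -> str:
    """Compare an observed error with the failure semantics the client assumes."""
    if observed is ErrorType.NONE:
        return "ok"
    return "managed" if observed <= assumed else "service failure"


# A client that assumes omission failure semantics, with a 0 to 1 second window.
window = (0.0, 1.0)
silent = [Response(None), Response(None)]   # server never answers
late = [Response(0.2), Response(3.5)]       # second answer arrives too late

print(outcome(classify(silent, *window), ErrorType.OMISSION))
print(outcome(classify(late, *window), ErrorType.OMISSION))
```

In this sketch the silent server is classified as a crash, which a client assuming omission failure semantics can manage, whereas the late response is a timing error that falls outside the assumed semantics and is therefore reported as a service failure.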

References

  1. Flaviu Cristian, Understanding Fault-Tolerant Distributed Systems.
  2. Arno Puder (2005). Distributed Systems Architecture. Morgan Kaufmann. ISBN 1558606483, pp. 14–16.