Software fault tolerance

Software fault tolerance is the ability of computer software to continue its normal operation despite the presence of system or hardware faults.

Introduction

The only thing constant is change. This is certainly more true of software systems than of almost any other phenomenon.^[1] Not all software changes in the same way, so software fault tolerance methods are designed to overcome execution errors by modifying variable values to create an acceptable program state.^[2] Controlling software faults is one of the growing challenges facing the software industry today, so fault tolerance should be a key consideration from the early stages of software development.

Software Fault Tolerance Methodology

Researchers have approached the development of fault-tolerant software from several angles, but the most widely used methods are:

N-version programming: this is based on hardware fault tolerance techniques that compare the outputs of duplicate hardware modules run in parallel. In the software form, N independently developed versions of a program run in parallel and their outputs are compared, in an attempt to minimize coincident failures caused by common logical flaws.
Recovery blocks: this is a software adaptation of hardware fault tolerance techniques in which a set of redundant blocks is created, each of which produces a comparable result. An acceptance test checks each block's result, and the next block is tried if the test fails.
Robust data structures: this method can be more suitable than N-version programming and recovery blocks because it allows errors in user-defined structures, such as lists and trees, to be detected and corrected.^[3] Periodically, a detection algorithm examines a structure for errors.
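The idea behind N-version programming can be illustrated with a minimal Python sketch. It assumes three hypothetical versions of an integer square root (isqrt_a, isqrt_b, isqrt_c, the last deliberately faulty) and a hypothetical voter, n_version_vote, that accepts a result only when a strict majority of versions agree. The versions run one after another here rather than in parallel, and none of these names come from the cited literature.

```python
from collections import Counter


def n_version_vote(versions, x):
    """Run every version on x and return the output a strict majority agrees on."""
    outputs = []
    for version in versions:
        try:
            outputs.append(version(x))  # outputs must be hashable to be counted
        except Exception:
            continue  # a crashed version simply casts no vote
    if not outputs:
        raise RuntimeError("all versions failed")
    value, count = Counter(outputs).most_common(1)[0]
    # Require agreement from more than half of all versions, not just survivors.
    if count * 2 <= len(versions):
        raise RuntimeError("no majority agreement")
    return value


# Three hypothetical, independently written versions of integer square root.
def isqrt_a(n):
    return int(n ** 0.5)


def isqrt_b(n):
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    return r


def isqrt_c(n):
    return n // 2  # deliberately faulty version


result = n_version_vote([isqrt_a, isqrt_b, isqrt_c], 49)
```

Here the two correct versions outvote the faulty one. The voter raises an error rather than guessing when no majority exists, which is one possible design choice among several.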
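A recovery block can be sketched in the same spirit. The example below assumes a hypothetical driver, recovery_block, that gives each alternate a fresh copy of the input and returns the first result that passes an acceptance test. The two sorting routines and the test is_sorted_permutation are illustrative assumptions, not part of any cited scheme.

```python
from collections import Counter


def recovery_block(alternates, acceptance_test, data):
    """Try each alternate in order; return the first result that is accepted."""
    for alternate in alternates:
        checkpoint = list(data)  # each alternate starts from the same saved input
        try:
            result = alternate(checkpoint)
        except Exception:
            continue  # treat a crash like a failed acceptance test
        if acceptance_test(data, result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


def fast_but_faulty_sort(items):
    return sorted(set(items))  # bug: silently drops duplicate values


def simple_sort(items):
    out = []
    for item in items:
        i = 0
        while i < len(out) and out[i] <= item:
            i += 1
        out.insert(i, item)  # plain insertion sort as the backup alternate
    return out


def is_sorted_permutation(original, result):
    ordered = all(a <= b for a, b in zip(result, result[1:]))
    return ordered and Counter(result) == Counter(original)


result = recovery_block(
    [fast_but_faulty_sort, simple_sort], is_sorted_permutation, [3, 1, 2, 3]
)
```

The primary alternate loses a duplicate, fails the acceptance test, and the backup alternate supplies the accepted result. Unlike the voting approach, only one result is checked at a time, so the quality of the acceptance test largely determines what errors are caught.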
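A robust data structure stores redundant information that a periodic audit can check and repair. The sketch below is a simplified illustration, not the scheme from the cited paper: a hypothetical RobustList keeps back pointers and an element count as redundancy, and its audit method assumes the forward chain is intact and acyclic, rebuilding the redundant fields from it.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None
        self.prev = None  # redundant: recoverable from the forward chain


class RobustList:
    def __init__(self):
        self.head = None
        self.tail = None
        self.count = 0  # redundant: recoverable by walking the chain

    def append(self, value):
        node = Node(value)
        if self.tail is None:
            self.head = node
        else:
            self.tail.next = node
            node.prev = self.tail
        self.tail = node
        self.count += 1
        return node

    def audit(self):
        """Walk the forward chain, repair redundant fields, return repairs made."""
        repairs = 0
        previous, node, seen = None, self.head, 0
        while node is not None:
            if node.prev is not previous:  # detected a damaged back pointer
                node.prev = previous
                repairs += 1
            previous, node, seen = node, node.next, seen + 1
        if self.tail is not previous:
            self.tail = previous
            repairs += 1
        if self.count != seen:  # detected a damaged count
            self.count = seen
            repairs += 1
        return repairs


lst = RobustList()
nodes = [lst.append(v) for v in "abc"]
nodes[2].prev = nodes[0]  # simulate a corrupted back pointer
lst.count = 7             # simulate a corrupted count
repairs = lst.audit()
```

After the audit both injected errors are corrected, and a second audit finds nothing to repair. Damage to the forward pointers themselves is outside what this sketch can handle.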

References

[1] Eckhardt, D. E., "Fundamental Differences in the Reliability of N-Modular Redundancy and N-Version Programming", The Journal of Systems and Software, 8, 1988, pp. 313-318.
[2] Giguette, R. and Hassell, J., "Toward A Resourceful Method of Software Fault Tolerance", ACM Southeast Regional Conference, April 1999.
[3] Taylor, D. J., Morgan, D. E., and Black, J. P., "Redundancy in Data Structures: Improving Software Fault Tolerance", IEEE Transactions on Software Engineering, 6, 6, November 1980, pp. 585-594.

