All content in this area was uploaded by George Popov on Feb 24, 2016
Using Diversity as Fault Tolerance Tool
George Popov
Technical University of Sofia, Bulgaria
email: popovg@tu-sofia.bg
Abstract: Diversity is a known approach for increasing the reliability of computer systems.
The goal of this work is to present the properties of diversity as a fail-safe and fault tolerance
tool and to give a quantitative criterion for measuring diversity. For this purpose, a model of a
diversity-based system with two failure types, detectable and undetectable, is presented, and a
formula for calculating this measure is proposed.
Keywords: diversity, dependability, computer system, embedded system, fail-safe, fault-tolerance.
1. INTRODUCTION
As is well known, the reliability of calculations can be increased by performing them in
multiple ways (multi-channel, multi-version) and comparing the results using a defined
criterion: predomination, majority or concordance.
This paper is devoted to the two-version method, known as diversity.
Diversity is a method of solving a problem (mathematical, logical, technical or other) in two
different ways (paths A and B) with identical input data, where the criterion for a correct
solution is the correspondence (in this particular case, identity) of the obtained output results.
If the input data are equal, the results from both channels must be equal. This holds only if
both channels work correctly. If a channel produces an error, the results generally differ.
By comparing the two results we can detect an error and, with a suitable design, locate the
channel in which it occurred.
However, this can also be achieved with multichannel homogeneous methods, in which two
channels perform identical processing. The advantage of diversity becomes clear when we
consider the kinds of errors that can occur.
The errors can be:
- the result of random faults in the system modules: malfunctions, short circuits, broken
circuits, etc.;
- systematic, when system rules and functional algorithms are violated.
The first group of disturbances can be detected by both diverse and homogeneous methods.
The main advantage of homogeneous systems is that they are cheaper. If the homogeneous
structure uses identical units (computers, controllers), a fault will appear in one of them at
some point in time. When the fault is activated by the information stream, the output results
of the two units will differ.
If the errors are caused by parasitic signals from the power supply or the environment (which
may enter both channels in the same way), they produce results that are equal but wrong and
therefore unrecognizable by comparison. Such results may be accepted as correct, which can
be dangerous.
Homogeneous repeated consecutive processing in one unit (computer, channel or controller)
can detect sporadic disturbances, because they occur at different random times: one of the
results is wrong while the other is correct. The comparison detects the error.
Homogeneous methods can also be used for multichannel identification of one event, object
or phenomenon. For example, two PIR detectors watch for an intruder, but because they are
placed at different locations they “see” the intruder in different ways. One of them may fail to
detect the intruder. If we take the disjunction of the output results, the probability of detection
is higher.
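The gain from the disjunction can be estimated with a short calculation. The sketch below assumes that the two detectors miss the intruder independently of each other, which the text does not state; the function name and the probabilities 0.9 and 0.8 are illustrative only.

```python
def detection_probability_or(p_a: float, p_b: float) -> float:
    """Probability that at least one of two detectors fires.

    Assumes the detectors fail independently (an assumption made for
    this sketch, not a claim from the paper).
    """
    for p in (p_a, p_b):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    # The intruder is missed only if both detectors miss.
    return 1.0 - (1.0 - p_a) * (1.0 - p_b)


# Illustrative values: the combined probability exceeds either single one.
combined = detection_probability_or(0.9, 0.8)
print(f"combined detection probability: {combined:.2f}")
```

Under this independence assumption the combined probability is never lower than that of the better detector.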
The second group of disturbances (including conceptual, design, synthesis, technological and
documentation errors) cannot be detected using homogeneous methods. A systematic error
repeats in all modules that are manufactured according to the same technical documentation.
The most effective method for detecting such errors is diversity (in the general case,
N-version processing).
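The difference between the two approaches can be illustrated with a toy example. In the sketch below, channel A contains a deliberately injected systematic (design) error; two identical copies of it agree on a wrong result, while a diverse channel B exposes the disagreement. The leap-year task and all function names are hypothetical choices made for illustration.

```python
import calendar


def is_leap_a(year: int) -> bool:
    # Channel A: systematic design error, the century rule is missing.
    return year % 4 == 0


def is_leap_b(year: int) -> bool:
    # Channel B: a diverse path, taken from the standard library.
    return calendar.isleap(year)


def compare(channel_1, channel_2, data):
    """Run both channels on identical input; return (agree, r1, r2)."""
    r1, r2 = channel_1(data), channel_2(data)
    return r1 == r2, r1, r2


# Homogeneous pair: both copies share the error, so they agree on a wrong result.
homogeneous = compare(is_leap_a, is_leap_a, 1900)
# Diverse pair: the results differ, so the error is detected.
diverse = compare(is_leap_a, is_leap_b, 1900)
print("homogeneous channels agree:", homogeneous[0])
print("diverse channels agree:", diverse[0])
```

The comparison only reveals that the channels disagree; deciding which channel is wrong needs additional design measures.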
2. FAULT TOLERANCE PROPERTY OF DIVERSITY
Diversity can be used in two roles:
Fail-safe: errors are identified through this instrument, and system processing is
cancelled when the system is damaged. This requires comparing the results
for equivalence (Fig.1).
[Figure: Input Data is fed to Processing A and Processing B; Output result A and Output result B pass to Comparison, which signals OK.]
Fig.1: Fail safe approach of diversity
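The fail-safe arrangement can be sketched in a few lines. The example below is an illustration under stated assumptions, not the author's implementation: the task (a square root), the two paths, the `ChannelMismatch` exception and the tolerance `tol` are all hypothetical. A tolerance replaces strict identity because the two paths use floating-point arithmetic.

```python
import math


class ChannelMismatch(Exception):
    """Raised when the channels disagree; the output is withheld."""


def sqrt_channel_a(x: float) -> float:
    # Path A: library routine.
    return math.sqrt(x)


def sqrt_channel_b(x: float, iterations: int = 100) -> float:
    # Path B: Newton's iteration, a different algorithm for the same task.
    # The fixed iteration count is enough for moderate inputs only.
    if x < 0:
        raise ValueError("negative input")
    if x == 0:
        return 0.0
    guess = max(x, 1.0)
    for _ in range(iterations):
        guess = 0.5 * (guess + x / guess)
    return guess


def fail_safe(channel_a, channel_b, data: float, tol: float = 1e-9) -> float:
    """Release a result only if both channels agree within tol."""
    result_a = channel_a(data)
    result_b = channel_b(data)
    if not math.isclose(result_a, result_b, rel_tol=tol, abs_tol=tol):
        raise ChannelMismatch(f"A={result_a!r}, B={result_b!r}")
    return result_a


print(fail_safe(sqrt_channel_a, sqrt_channel_b, 2.0))
```

Raising an exception stands in for cancelling system processing; a real system would move to a defined safe state instead.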
Fault-tolerance: the goal of this instrument is to tolerate errors so that the
whole system keeps its efficiency. There are two ways:
- logic for system reservation (redundancy) with diversity switching (Fig.2);
Fig.2: A reservation with diversity switching
- logic with hot diversity reservation, i.e. permanently incorporated diversity
reservation (Fig.3).
Fig.3: A hot “diversity” reservation
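One possible reading of reservation with diversity switching is sketched below. Since the figures are not reproduced in the text, the switching logic is an assumption: a primary channel runs first, an acceptance test checks its result, and control switches to a diverse reserve channel if the test fails. The sorting task, the injected fault and all names are hypothetical.

```python
from collections import Counter


def faulty_sort(values):
    # Primary channel with an injected fault: returns the data unsorted.
    return list(values)


def insertion_sort(values):
    # Diverse reserve channel: an independently written algorithm.
    result = []
    for v in values:
        pos = len(result)
        while pos > 0 and result[pos - 1] > v:
            pos -= 1
        result.insert(pos, v)
    return result


def acceptable(data, result):
    # Assumed acceptance test: output is ordered and a permutation of the input.
    ordered = all(a <= b for a, b in zip(result, result[1:]))
    return ordered and Counter(result) == Counter(data)


def run_with_switching(channels, data, accept):
    """Try the channels in order; return (index, result) of the first accepted one."""
    for index, channel in enumerate(channels):
        try:
            result = channel(data)
        except Exception:
            continue  # a crashed channel counts as failed
        if accept(data, result):
            return index, result
    raise RuntimeError("all diverse channels failed")


index, result = run_with_switching([faulty_sort, insertion_sort], [3, 1, 2], acceptable)
print(f"channel {index} delivered {result}")
```

In a hot reservation both channels would run permanently instead of the reserve being started on demand; the sketch covers only the switching variant.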
3. CALCULATION OF THE FAULT TOLERANCE PROPERTY OF DIVERSITY
Let a dual-channel system have a flow of failures with intensity λ. Set {A} represents
the faults of channel A and set {B} the faults of channel B (see Fig.4).
Obviously, there are common and different reasons for failures in these channels.
Set {A∩B} represents the common reasons. Sets {A-B} and {B-A} represent the different
reasons. Set {A∪B} represents all failures in the diversity system.
Fig.4: Schema of faults in one diversity system
Let {D} be the set of faults that do not lead to general failure, because these
faults are elements of {A-B} or {B-A}. In this case one channel remains working
and the system stays available.
$$\{D\} = \{A \cup B\} - \{A \cap B\} \qquad (1)$$
If we relate the number of faults that do not lead to general failure (Card{D}) to the number
of all faults in the system (Card{A∪B}), we obtain an estimate of the depth of diversity Ω.
Legend to Fig.4:
{A}: reasons for faults in channel A
{B}: reasons for faults in channel B
{A-B}: reasons for faults in channel A other than those of channel B
{B-A}: reasons for faults in channel B other than those of channel A
{B∩A}: reasons for faults in channel A common to those of channel B
$$\Omega = \frac{\mathrm{Card}\{A \cup B\} - \mathrm{Card}\{A \cap B\}}{\mathrm{Card}\{A \cup B\}} \qquad (2)$$
$$\Omega = 1 - \frac{\mathrm{Card}\{A \cap B\}}{\mathrm{Card}\{A \cup B\}} \qquad (3)$$
When Card{A∩B} = 0, Ω = 1 and diversity is maximal. Conversely, if
Card{A} = Card{B} = Card{A∩B}, the depth of diversity is zero.
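When only the sizes of the individual sets and of their intersection are known, the size of the union follows from the inclusion-exclusion identity. This is a standard result of set theory, not a formula stated in the paper, and it lets Ω be written in terms of Card{A}, Card{B} and Card{A∩B} alone:

```latex
\mathrm{Card}\{A \cup B\} = \mathrm{Card}\{A\} + \mathrm{Card}\{B\} - \mathrm{Card}\{A \cap B\}
\quad\Longrightarrow\quad
\Omega = 1 - \frac{\mathrm{Card}\{A \cap B\}}
                  {\mathrm{Card}\{A\} + \mathrm{Card}\{B\} - \mathrm{Card}\{A \cap B\}}
```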
For example, if the two sets partially overlap, with Card{A} = 20, Card{B} = 30 and
Card{A∩B} = 10, so that Card{A∪B} = 40, then the depth of diversity Ω is:
$$\Omega = 1 - \frac{10}{40} = 0.75 \qquad (4)$$
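The depth of diversity Ω = 1 - Card{A∩B} / Card{A∪B} can also be computed directly from sets of fault causes. The sketch below is a minimal illustration; the function name and the integer labels used to stand for fault causes are assumptions, chosen only to reproduce the cardinalities of the example (20, 30 and 10).

```python
def depth_of_diversity(faults_a: set, faults_b: set) -> float:
    """Return 1 - Card{A∩B} / Card{A∪B} for two sets of fault causes."""
    union = faults_a | faults_b
    if not union:
        raise ValueError("at least one fault cause is required")
    common = faults_a & faults_b
    return 1.0 - len(common) / len(union)


# Hypothetical fault causes, labelled by integers.
faults_a = set(range(0, 20))   # 20 causes in channel A
faults_b = set(range(10, 40))  # 30 causes in channel B, 10 of them shared
omega = depth_of_diversity(faults_a, faults_b)
print(omega)
```

The same value is obtained as the share of non-common causes, `len(faults_a ^ faults_b) / len(faults_a | faults_b)`, which corresponds to Card{D} / Card{A∪B}.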
4. CONCLUSION
From the models and the research presented, the following important generalizations
can be made:
1. Reliability, as an essential feature of systems, can be based on the diversity
reservation approach.
2. The greater the depth of diversity, the greater the reliability.
3. To determine the factors that govern the depth of diversity, it is necessary to examine
the specific scheme for the case at hand and to identify the general and local causes of
failures and their intensity.