| Ariane 5's first test flight (Flight 501) on June 4, 1996 failed: the rocket self-destructed about 40 seconds after launch
because of a malfunction in the control software. The failure is one of the most expensive computer bugs in history.
The short version
The Ariane 5 software reused the specifications from the Ariane 4, but the Ariane 5's flight path was considerably different
and lay beyond the range for which the reused code had been intended. In addition, pre-flight tests never exercised the realignment code
under simulated Ariane 5 flight conditions, so the error was not discovered before launch.
Because of the different flight path, a data conversion from a 64-bit floating point value to a 16-bit signed integer
caused a hardware exception (operand error): the floating point number was too large to be represented as a 16-bit signed integer.
Efficiency considerations had led to the disabling of the software handler (in Ada code) for this trap, although other conversions
of comparable variables in the code remained protected. This led to a cascade of problems, culminating in the destruction of the
entire flight.
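The conversion problem described above can be sketched in a few lines. This is an illustration in Python, not the flight code (which was written in Ada): the names `OperandError`, `to_int16_unprotected` and `to_int16_saturating` are hypothetical, and clamping is only one possible form of protection, assumed here for the sake of the example.

```python
INT16_MIN, INT16_MAX = -32768, 32767


class OperandError(Exception):
    """Stands in for the hardware exception raised by an out-of-range conversion."""


def to_int16_unprotected(value: float) -> int:
    """Convert a float to a 16-bit signed integer; out-of-range values raise."""
    result = int(value)  # truncates toward zero
    if not INT16_MIN <= result <= INT16_MAX:
        raise OperandError(f"{value} does not fit in a 16-bit signed integer")
    return result


def to_int16_saturating(value: float) -> int:
    """Assumed protected variant: clamp to the representable range instead of raising."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))
```

With these definitions, `to_int16_unprotected(40000.0)` raises `OperandError`, while `to_int16_saturating(40000.0)` returns 32767. If nothing catches the exception, the unprotected call ends the computation that contains it, which is the situation the summary describes.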
Full report
This excerpt is taken from the Report by the Inquiry Board (http://www.dcs.ed.ac.uk/home/pxs/Book/ariane5rep.html)
- 3. CONCLUSIONS
- 3.1 FINDINGS
- The Board reached the following findings:
- ...
- e) At 36.7 seconds after H0 (approx. 30 seconds after lift-off) the computer within the back-up inertial reference system,
which was working on stand-by for guidance and attitude control, became inoperative. This was caused by an internal variable
related to the horizontal velocity of the launcher exceeding a limit which existed in the software of this computer.
- f) Approx. 0.05 seconds later the active inertial reference system, identical to the back-up system in hardware and software,
failed for the same reason. Since the back-up inertial system was already inoperative, correct guidance and attitude information
could no longer be obtained and loss of the mission was inevitable.
- g) As a result of its failure, the active inertial reference system transmitted essentially diagnostic information to the
launcher's main computer, where it was interpreted as flight data and used for flight control calculations.
- h) On the basis of those calculations the main computer commanded the booster nozzles, and somewhat later the main engine
nozzle also, to make a large correction for an attitude deviation that had not occurred.
- i) A rapid change of attitude occurred which caused the launcher to disintegrate at 39 seconds after H0 due to aerodynamic
forces.
- ...
- m) The inertial reference system of Ariane 5 is essentially common to a system which is presently flying on Ariane 4. The
part of the software which caused the interruption in the inertial system computers is used before launch to align the inertial
reference system and, in Ariane 4, also to enable a rapid realignment of the system in case of a late hold in the countdown. This
realignment function, which does not serve any purpose on Ariane 5, was nevertheless retained for commonality reasons and
allowed, as in Ariane 4, to operate for approx. 40 seconds after lift-off.
- n) During design of the software of the inertial reference system used for Ariane 4 and Ariane 5, a decision was taken that
it was not necessary to protect the inertial system computer from being made inoperative by an excessive value of the variable
related to the horizontal velocity, a protection which was provided for several other variables of the alignment software. When
taking this design decision, it was not analysed or fully understood which values this particular variable might assume when the
alignment software was allowed to operate after lift-off.
- o) In Ariane 4 flights using the same type of inertial reference system there has been no such failure because the trajectory
during the first 40 seconds of flight is such that the particular variable related to horizontal velocity cannot reach, with an
adequate operational margin, a value beyond the limit present in the software.
- p) Ariane 5 has a high initial acceleration and a trajectory which leads to a build-up of horizontal velocity which is five
times more rapid than for Ariane 4. The higher horizontal velocity of Ariane 5 generated, within the 40-second timeframe, the
excessive value which caused the inertial system computers to cease operation.
- q) The purpose of the review process, which involves all major partners in the Ariane 5 programme, is to validate design
decisions and to obtain flight qualification. In this process, the limitations of the alignment software were not fully analysed
and the possible implications of allowing it to continue to function during flight were not realised.
- r) The specification of the inertial reference system and the tests performed at equipment level did not specifically include
the Ariane 5 trajectory data. Consequently the realignment function was not tested under simulated Ariane 5 flight conditions,
and the design error was not discovered.
- ...
- t) Post-flight simulations have been carried out on a computer with software of the inertial reference system and with a
simulated environment, including the actual trajectory data from the Ariane 501 flight. These simulations have faithfully
reproduced the chain of events leading to the failure of the inertial reference systems.
- 3.2 CAUSE OF THE FAILURE
- The failure of the Ariane 501 was caused by the complete loss of guidance and attitude information 37 seconds after start of
the main engine ignition sequence (30 seconds after lift-off). This loss of information was due to specification and design
errors in the software of the inertial reference system.
- The extensive reviews and tests carried out during the Ariane 5 Development Programme did not include adequate analysis and
testing of the inertial reference system or of the complete flight control system, which could have detected the potential
failure.
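Findings o), p) and r) suggest the kind of check that was missing: running the conversion's input over a simulated trajectory for the new launcher. The sketch below is a toy model under loudly stated assumptions. The linear growth model, the rate values and the function names are invented for illustration; only the "five times more rapid" ratio and the 40-second window come from the report.

```python
INT16_MAX = 32767

# Invented rates for illustration only; they are not real flight data.
ARIANE4_LIKE_RATE = 200.0
ARIANE5_LIKE_RATE = 5 * ARIANE4_LIKE_RATE  # "five times more rapid" build-up


def horizontal_value(t: float, rate: float) -> float:
    """Hypothetical internal variable that grows linearly with flight time."""
    return rate * t


def first_overflow_time(rate: float, window_s: float = 40.0, step_s: float = 0.1):
    """Return the first sampled time at which the value exceeds int16, else None."""
    steps = int(round(window_s / step_s))
    for i in range(steps + 1):
        t = i * step_s
        if abs(horizontal_value(t, rate)) > INT16_MAX:
            return t
    return None
```

In this toy model, `first_overflow_time(ARIANE4_LIKE_RATE)` returns `None` because the value stays in range for the whole window, whereas `first_overflow_time(ARIANE5_LIKE_RATE)` returns a time inside the window. The times themselves mean nothing; the point is that the same code passes or fails depending on which trajectory the test uses.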
Aftermath
Flight 501's high-profile failure brought the high risks associated with complex computing systems to the attention of the general
public, politicians and executives. It is now a classic example of those risks, and it led to increased support for research on
ensuring the reliability of safety-critical computer systems. The subsequent automated analysis of the Ariane code was the first
example of large-scale static analysis by abstract interpretation.
The failure also tarnished the excellent record of the European Space Agency's rocket family, previously earned through the
stellar success rate of the Ariane 4 model. Only recently have Ariane 5 launches been as reliable as those of the predecessor model.
External link
- Ariane 5 - 501 (1-3) (http://www-aix.gsi.de/~giese/swr/ariane5.html) - A good article (in German) where the actual code in question is given
|