The system controlled and monitored traffic signals in and around tunnels. We provided support for building the safety case and for the formal verification of the system. Test cases were designed and defined at the module, integration and system levels according to the system specifications, in full compliance with the safety standard IEC 61508. Dynamic testing was performed with Cantata, a well-known commercial tool specialised in safety-related systems that supports both static and dynamic testing; it covered module testing, integration testing and coverage analysis, achieving 100% statement coverage and 100% condition coverage. We also used the automated static analysis tool QAC++ to check adherence to the High Integrity C++ coding standard. This project was developed at Simulations Systems Ltd, Bristol, UK.
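The difference between statement coverage and condition coverage can be illustrated with a small, purely hypothetical sketch (the function and thresholds below are not from the project):

```python
def trip_required(pressure_high, temp_high):
    # A single test with (True, True) executes every statement
    # (100% statement coverage), but never evaluates either atomic
    # condition as False, so condition coverage is incomplete.
    if pressure_high or temp_high:
        return True
    return False

# Condition coverage requires each atomic condition to take both
# truth values somewhere in the test set.
tests = [(True, False), (False, True), (False, False)]
results = [trip_required(p, t) for p, t in tests]
```

This is why a test suite sized for statement coverage usually has to grow to reach 100% condition coverage.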
Interest in Statistical Software Testing (SST) is growing because it provides a software assurance technique that is both sound and practical. In common with formal proof methods, it is one of the few techniques that offers an objective, quantitative measure of software quality: SST provides an estimate of the probability of failure. In addition, SST side-steps Dijkstra's famous statement that "Program testing can be used to show the presence of bugs, but never to show their absence!". The statement is true, but it assumes that total absence of failure is the only acceptable goal. Other engineering disciplines do not make this assumption: there the goal is risk reduction, and the complete absence of system failures is regarded as unrealistic.
Statistical software testing is a dynamic testing technique, designed in a very specific way that makes it possible to assess the system's probability of failure in terms of, for example, the number of failures per hour. No other testing technique allows this. Such a probability may be used as evidence in a safety case or as stand-alone assurance for the software under test.
The core conditions for statistical testing are that statistical test cases must be (a) generated through a probabilistic simulation of the application environment for which the dependability statement is to be derived, and (b) statistically independent.
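When both conditions hold, the standard zero-failure argument shows how a failure probability estimate is derived. The sketch below is a hedged illustration of that general argument, not a calculation from any of the projects described here: if N independent test cases drawn from the operational profile all pass, solving (1 − p)^N = 1 − confidence gives an upper confidence bound on the failure probability p per demand.

```python
def failure_prob_upper_bound(n_tests, confidence=0.95):
    """Upper confidence bound on the per-demand failure probability
    after n_tests statistically independent, failure-free test cases.

    Solves (1 - p)**n_tests = 1 - confidence for p.
    """
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n_tests)

# Roughly 30,000 failure-free independent tests support a 95% bound
# near 1e-4 (the well-known "rule of three": p <~ 3/N).
bound = failure_prob_upper_bound(30000)
```

The bound is only meaningful when the test profile matches the operational profile, which is exactly what condition (a) demands.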
At the SSRC, we applied SST in several projects across multiple industry sectors. In one of these projects, SST was applied to a widely used industrial smart sensor manufactured by Moore Industries. The smart sensor design was based on a Motorola HC12 processor, and the device is used in safety-critical systems. The actual microprocessor and serial communication links were unavailable to us, so we had to simulate the microprocessor used in the device, the serial communication links, the permanent storage facilities and the configuration of the device. Furthermore, we needed a facility to read inputs from a file and replay them as data signals coming from the analog-to-digital (A/D) converter at specific time intervals. This made it possible to generate inputs following the probabilistic rules, store them in a file, and process them as a continuous data stream. Likewise, we needed a facility to write outputs to a file so that they could later be checked for passes and failures. We assessed two commercial tools to assist us in this, performed test runs on example random transients, and gained feedback on potential issues around testing the firmware. This was an important step, since it provided guidelines for generating statistical test cases and for setting up suitable test harnesses for statistical testing of smart device firmware. The Cantata testing tool was used to develop these test harnesses.
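The generate-store-replay pattern described above can be sketched as follows. This is a minimal illustration only: the uniform distribution, value range and file layout are assumptions, and the fixed-interval timing of the real A/D stream is elided.

```python
import random

def generate_inputs(n_samples, lo, hi, seed=42):
    """Draw n_samples simulated A/D readings from an operational
    profile. A uniform distribution is an illustrative assumption; a
    real profile would be derived from the application environment."""
    rng = random.Random(seed)
    return [rng.uniform(lo, hi) for _ in range(n_samples)]

def write_input_file(path, samples):
    """Store the generated inputs, one reading per line."""
    with open(path, "w") as f:
        for s in samples:
            f.write(f"{s:.6f}\n")

def replay_as_stream(path):
    """Yield stored samples one by one, as if they arrived from the
    A/D converter at fixed time intervals (timing omitted here)."""
    with open(path) as f:
        for line in f:
            yield float(line)
```

Outputs are handled symmetrically: the harness writes each result to a file, which the oracle later reads to decide pass or fail.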
Furthermore, we designed and implemented a test oracle based on the software requirements specifications.
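In outline, such an oracle recomputes the verdict each requirement demands and compares it with the logged output. The sketch below is a toy version with a single hypothetical requirement; the threshold and verdict names are assumptions, not the sensor's actual specification.

```python
def oracle_check(input_value, observed_output, trip_threshold=10.0):
    """Derive the expected verdict from a requirements-style rule and
    compare it with the output logged by the test harness. The
    threshold and verdict labels are illustrative assumptions."""
    expected = "TRIP" if input_value > trip_threshold else "NORMAL"
    return expected == observed_output

def check_log(pairs):
    """pairs: (input, observed_output) tuples read back from the
    output file. Returns the indices of failing test cases."""
    return [i for i, (x, out) in enumerate(pairs) if not oracle_check(x, out)]
```

Because the oracle is derived from the requirements rather than from the implementation, a disagreement counts as a genuine failure observation for the statistical estimate.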
Safety-critical software development processes are built on safety standards such as IEC61508-3, DEF STAN 00-55 or DO-178B. IEC61508 is a well-established standard in civil industry. The standard is intended to serve as a basis for the preparation of more sector-specific standards such as IEC61511, or for stand-alone use where no more specific sector standard or industry code of practice exists. IEC61508 provides requirements for, and guidance on, developing programmable systems for protection and safety-related control that implement safety functions of sufficient integrity to reduce the risk arising from identified hazards to an acceptable level. Examples of protection and safety-related control systems include: a nuclear power station protection system; a railway traffic management system; a machinery control system; and an offline advisory system whose output contributes to a safety decision. A safety function implemented by a safety-related system can comprise a software element, a hardware element or both. The integrity of a safety function is a measure of the confidence that the safety-related system will satisfactorily perform the specified safety function when required. Integrity is expressed as a Safety Integrity Level (SIL) in the range 1–4, where 4 is the most demanding. In determining safety integrity, all causes of unsafe failures are taken into account: random hardware failures, and systematic failures of both hardware and software. Some types of failure, such as random hardware failures, may be quantified as failure rates, while factors in systematic failure often cannot be accurately quantified and can only be considered qualitatively. A qualitative SIL requirement (covering, for example, the competency of the development team, the industry application factor, and the types of verification methods to employ and the intensity of their application) is interpreted as the degree of rigour with which recommended system development techniques should be applied in order to achieve the required confidence in the correct operation of the associated safety function.
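Alongside the qualitative requirements, IEC61508 also sets quantitative target failure measures per SIL. The sketch below encodes the bands for high-demand/continuous mode, expressed as the probability of a dangerous failure per hour (PFH); low-demand mode uses different bands (average probability of failure on demand), so part 1 of the standard should be consulted for the mode of operation that applies.

```python
# IEC 61508 target failure measures, high-demand / continuous mode:
# probability of a dangerous failure per hour (PFH), [lower, upper).
SIL_PFH_BANDS = {
    4: (1e-9, 1e-8),
    3: (1e-8, 1e-7),
    2: (1e-7, 1e-6),
    1: (1e-6, 1e-5),
}

def sil_for_pfh(pfh):
    """Return the SIL whose band contains the given PFH, or None if
    the failure rate is too high to claim any SIL."""
    for sil, (lo, hi) in sorted(SIL_PFH_BANDS.items(), reverse=True):
        if lo <= pfh < hi:
            return sil
    return None
```

This is where the failure probability estimate from statistical testing connects to the standard: a demonstrated bound on the dangerous failure rate maps directly onto a claimable SIL band.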
The use of development standards is common in current good practice. Software safety standards recommend processes to design and assure the integrity of safety-related software. However, the reasoning about the validity of these processes is complex and opaque. In this project we used Graphical Probability Models (GPMs) to formalise the reasoning that underpins the construction of a Safety Integrity Level (SIL) claim based upon a safety standard such as IEC61508. There are three major benefits: (1) the reasoning becomes compact and easy to comprehend, facilitating its scrutiny and making it easier for experts to develop a consensus using a common formal framework; (2) the task of the regulator is supported, because the subjective reasoning which underpins the expert consensus on compliance is to some degree captured in the structure of the GPM; (3) users will benefit from software tools that support implementation of IEC61508. Such tools even have the potential to allow cost-benefit analysis of alternative safety assurance techniques.
The proposed prototype Bayesian belief network (BBN) structure introduces a novel way to capture the effects that interactions between the phases of a standard have on integrity claims. The experimental results gave an indication of the effectiveness of applying BBN models to safety-related systems.
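The mechanics of such a model can be shown with a toy BBN, kept deliberately small. Here the probability that a SIL claim is justified (C) depends on design-phase quality (D) and verification-phase quality (V); all probability values are illustrative assumptions, not figures from IEC61508 or from the project's actual network.

```python
# Priors on phase quality (illustrative assumptions).
P_D = {True: 0.7, False: 0.3}   # P(design phase done well)
# Conditional probability table: P(claim justified | D, V).
P_C = {(True, True): 0.95, (True, False): 0.6,
       (False, True): 0.5, (False, False): 0.1}

def posterior_claim(v_observed):
    """P(C = true | V = v_observed), by enumerating over the hidden
    design-quality node D. Because V is observed directly, its prior
    cancels out of the conditional probability."""
    return sum(P_D[d] * P_C[(d, v_observed)] for d in (True, False))
```

Observing good verification raises the claim probability (0.815 here) relative to observing poor verification (0.45), which is exactly the kind of cross-phase interaction the prototype structure was designed to capture.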
The goal of this research was twofold: to investigate the enhancement of software reliability and to assess the potential use of fault tolerance in safety arguments for systematic failures. We set out to devise the principles of an on-line diagnostic approach to fault tolerance (FT) that integrates data-diversity assertions with traditional assertions (data diversity as a complement to design diversity). This FT technique focused on detecting residual design errors that appear at execution time, after verification and validation, as systematic failures. The latency of such errors can be long in normal operation; they become apparent only under specific conditions associated with particular combinations of inputs and internal system states. On-line diagnosis is the only technique available to mitigate such residual design errors after analysis, design and testing.
We used data diversity (DD) techniques as a potential solution to the problem of systematic failures in safety-critical software, and devised an assessment strategy to compare their effectiveness against a typical FT approach based on traditional assertions (TA). A set of metrics was defined and used for the comparison. We drew the following conclusions. DD and TA showed some orthogonality: they trapped different faults. This finding was based on particular mutant groups; for example, in Constant Mutation we observed that these mutations were special in their ability to cause sporadic out-of-bounds memory accesses. For faults in conditional statements, DD was more effective than TA, but this is easily explained in terms of the type of faults each was designed to cover. DD was more effective at trapping faults in if-statements, as well as faults in non-conditional statements whose behaviour altered the control flow, because DD takes the computation out of the failure region. DD sometimes still failed to detect faults that changed the control flow of computations.
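The core data-diversity idea, re-expressing an input with a small perturbation that the application can tolerate and checking that the re-expressed runs agree, can be sketched as follows. The function under test, the trip level and the perturbation size are all hypothetical.

```python
def pressure_trip(reading, trip_level=100.0):
    # Application function under test: trip when the reading exceeds
    # the level. A seeded fault would perturb this logic.
    return reading > trip_level

def dd_assertion(reading, epsilon=0.01, trip_level=100.0):
    """Data-diversity assertion: re-run the computation on slightly
    perturbed copies of the input (tolerable because sensor data is
    noisy) and flag a potential systematic failure if the runs
    disagree where they should agree."""
    original = pressure_trip(reading, trip_level)
    diverse = [pressure_trip(reading + d, trip_level)
               for d in (-epsilon, +epsilon)]
    # Near the trip boundary, disagreement is legitimate; elsewhere it
    # suggests the computation sits inside a failure region.
    near_boundary = abs(reading - trip_level) <= epsilon
    return near_boundary or all(r == original for r in diverse)
```

Because the perturbed runs take the computation out of a point-like failure region, this check can catch faults that a fixed range assertion on the output would miss.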
When we introduced the calculation of expected paths into the design of DD, we observed a slight increase in its effectiveness. As expected, confirming that an expected path has been taken does not guarantee that the output of the computation is correct. The fact that the number of paths can grow massively as the size of the software increases could be a serious limitation of the proposed use of path information in FT. However, one could argue that in safety-critical applications it is often possible to identify highly critical functions whose complexity and size are relatively small, for example 50 to 100 lines of code.
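An expected-path check can be added with lightweight instrumentation, as in this hypothetical sketch: the function records which branch it takes, and an assertion compares the recorded path with the path the design says the input region must follow.

```python
def classify(level, path_log):
    """Instrumented function: appends the branch it takes to path_log.
    The 90.0 threshold and labels are illustrative assumptions."""
    if level > 90.0:
        path_log.append("HIGH")
        return "Trip"
    path_log.append("LOW")
    return "Normal"

def path_assertion(level, path_log):
    """Check that the executed path matches the path expected for
    this input region, as derived (hypothetically) from the design."""
    expected = ["HIGH"] if level > 90.0 else ["LOW"]
    return path_log == expected
```

As noted above, a passing path assertion does not imply a correct output: a fault that corrupts the result without diverting control flow still goes undetected, which is why path checks only complement the data-diversity assertions.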
Our data-diversity fault tolerance approach was implemented and assessed using a safety-critical nuclear protection system called DARTS (Demonstration of Advanced Reliability Techniques for Safety Related Computer Systems). DARTS was implemented by the nuclear industry using an industrial-strength development process. The DARTS software is well suited to data diversity, as its inputs are readings from sensors. Sensors typically provide noisy, imprecise data; small modifications to those data therefore do not adversely affect the application, which makes it suitable for implementing FT.
The DARTS software was written in the C language for a nuclear power plant with a Steam Generating Heavy Water Reactor. The plant has an extensive range of protection systems whose configuration is based on parameters from both the nuclear and the conventional parts of the plant.
The DARTS software takes as inputs the neutron power level, the steam pressure and the water level in the steam drums, and produces an output informing the user whether the status is Normal, Warning or Trip. A trip occurs if any of these parameters goes outside its predefined range, and a warning occurs when a level is within 2% of its trip level.
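The three-parameter protection logic just described can be sketched as follows. The parameter ranges are invented for illustration, and interpreting "within 2%" as 2% of each range's span is an assumption; only the Normal/Warning/Trip structure comes from the description above.

```python
# Hypothetical trip ranges for the three monitored parameters.
RANGES = {
    "neutron_power": (0.0, 100.0),
    "steam_pressure": (40.0, 70.0),
    "drum_water_level": (1.0, 3.0),
}

def status(readings, band=0.02):
    """Return 'Trip' if any parameter is outside its range, 'Warning'
    if any parameter is within 2% (of the range span) of a limit,
    and 'Normal' otherwise."""
    worst = "Normal"
    for name, value in readings.items():
        lo, hi = RANGES[name]
        margin = band * (hi - lo)
        if value < lo or value > hi:
            return "Trip"          # Trip dominates all other states
        if value < lo + margin or value > hi - margin:
            worst = "Warning"
    return worst
```

A protection function of this shape, small, with sensor-derived inputs and a clear output ordering, is exactly the kind of code where data-diversity assertions are cheap to apply.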
In addition to demonstrating the feasibility of the experimental approach, our results supported the hypothesis that multiple diverse FT techniques detect different types of faults; it is therefore plausible that this approach could be usefully employed to improve reliability.