Functional Safety: Predictable reactions in real-time (Part 2)
The memory seat (as described in part one) consists of hardware and software components that all have to be considered when designing and calculating the fault reaction:
- switches: belong to the user interface. Pressing a switch leads to movement until the seat reaches one of its end positions. Releasing a switch stops any movement. For this example the switch shall be connected to the digital input/output stage, even though this is not common practice. The I/O ports will be read every 500 µs. Any port with a switch requires filtering, which is done by an up-down counter running at the same period.
- current sensor: is connected to the analog-digital converter and provides information about the motor current. It is part of the anti-pinch function and has to be read every 400 µs.
- speed sensor: is the second part of the anti-pinch function. In principle it counts two pulses per revolution of the motor and is connected to the compare-capture unit of the microcontroller. The associated timer will be read every 400 µs.
- anti-pinch function: is a software function that detects entrapment. In this case the seat moves a bit in the opposite direction. The function also runs with a 400 µs period and is part of the application. The application does not have direct access to the hardware; an abstraction layer is implemented.
- CAN: during driving only small corrections of the positions are allowed. Therefore the system requires speed information that is provided every 20 ms. The jitter in the message timing needs to be considered. A message is considered lost when it is not received within 50 ms.
- Power stage: in principle it is a digital output in the I/O layer of the software.
- Power drive protection interrupt: in case of a short circuit or open load condition of the power stage an interrupt shall lock the memory seat function.
- Memory seat controller: is a state machine that takes care of all actions required. It uses all sensors and messages and drives the power stage.
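The switch filtering mentioned in the list above can be sketched as follows. This is only an illustration of an up-down counter filter: the class name `UpDownDebouncer` and the filter depth of 8 samples are assumptions, since the article states the 500 µs sampling period but not the filter depth.

```python
class UpDownDebouncer:
    """Up-down counter filter, called once per I/O sampling period."""

    def __init__(self, threshold: int = 8):
        # threshold (filter depth) is an assumed value, not taken from the article
        self.threshold = threshold
        self.counter = 0
        self.state = False  # debounced switch state

    def sample(self, raw: bool) -> bool:
        # Count up while the raw input is active, down while it is inactive.
        if raw:
            self.counter = min(self.counter + 1, self.threshold)
        else:
            self.counter = max(self.counter - 1, 0)
        # The debounced state changes only at the counter limits (hysteresis).
        if self.counter == self.threshold:
            self.state = True
        elif self.counter == 0:
            self.state = False
        return self.state


SAMPLE_PERIOD_US = 500  # I/O read period from the example

deb = UpDownDebouncer(threshold=8)
outputs = [deb.sample(True) for _ in range(8)]
# Number of samples until the press is accepted, converted to time.
detection_delay_us = (outputs.index(True) + 1) * SAMPLE_PERIOD_US
```

With these assumed values a clean press is accepted after 8 consecutive samples. The filter therefore contributes several sampling periods to the event chain, plus up to one further period because the press may occur just after a sample was taken.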
Fault: A switch breaks or a short circuit occurs during driving. In this case the seat will move without any command from the driver. The driver pushes the switch for the opposite direction to stop the movement. The broken-switch situation is detected and the memory seat controller state machine will prohibit any further movement. This is also part of the timing model.
The sequence of events will be:
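A minimal sketch of such a locking state machine is shown below. It assumes that the plausibility criterion for a broken switch is two opposite direction switches being active at the same time; the state names and the function `next_state` are hypothetical, and end positions, anti-pinch and vehicle speed are left out.

```python
from enum import Enum, auto


class SeatState(Enum):
    IDLE = auto()
    MOVING_FORWARD = auto()
    MOVING_BACKWARD = auto()
    LOCKED = auto()  # safe state: no further movement allowed


def next_state(state: SeatState, fwd_pressed: bool, bwd_pressed: bool) -> SeatState:
    """One step of a simplified seat controller using debounced switch inputs."""
    # Once locked, the controller stays locked (assumed: reset only by service).
    if state is SeatState.LOCKED:
        return SeatState.LOCKED
    # Assumed plausibility check: opposite commands at the same time
    # indicate a broken or shorted switch.
    if fwd_pressed and bwd_pressed:
        return SeatState.LOCKED
    if fwd_pressed:
        return SeatState.MOVING_FORWARD
    if bwd_pressed:
        return SeatState.MOVING_BACKWARD
    return SeatState.IDLE


# Fault scenario: forward switch is stuck, driver presses the opposite switch.
s = SeatState.IDLE
s = next_state(s, fwd_pressed=True, bwd_pressed=False)   # seat starts moving
s_after_fault = next_state(s, fwd_pressed=True, bwd_pressed=True)
s_after_release = next_state(s_after_fault, fwd_pressed=True, bwd_pressed=False)
```

In this sketch the controller remains in `LOCKED` even after the driver releases the second switch, which corresponds to prohibiting any further movement.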
- Switch malfunctions (Fault event at t = 0 s)
- System initiates seat to move (Failure occurs at t = 0.3 s worst case)
- Driver recognizes seat movement and is shocked (t = 1.3 s)
- Driver searches and presses the button for movement in opposite direction (t = 2.3 s)
- Seat controller detects fault and initiates transition to safe state ‘stopped seat’ (t must be smaller than 2.5 s, the emergency operation interval).
So there is a remaining time budget of 200 ms for the ECU to perform plausibility checks and debouncing operations, detect the failure and initiate the stopping of the seat. The electric motor has to decelerate and stop within this time span too, which takes 60 ms on average.
The design and implementation of the software and hardware that stop the seat is not simple. The following chapter discusses how, in real-world development projects, this has to be done by different people from multiple companies.
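The budget arithmetic can be written down explicitly. The times are taken from the sequence above; the constant names are illustrative, and the 60 ms motor stopping time is the average quoted in the text, so a worst-case value would reduce the ECU share further.

```python
# Times in milliseconds, measured from the fault event at t = 0.
EMERGENCY_OPERATION_INTERVAL_MS = 2500  # safe state must be reached before this
DRIVER_PRESSES_OPPOSITE_SWITCH_MS = 2300
MOTOR_STOP_MS = 60  # average value from the text, not a worst case

# Time left between the driver's reaction and the end of the interval.
reaction_budget_ms = EMERGENCY_OPERATION_INTERVAL_MS - DRIVER_PRESSES_OPPOSITE_SWITCH_MS

# Share left for debouncing, plausibility checks and the controller's decision.
ecu_budget_ms = reaction_budget_ms - MOTOR_STOP_MS
```

With the average motor stopping time this leaves 140 ms of the 200 ms for the ECU itself.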
1. Challenging Integration Tasks
Functional safety is the absence of unacceptable risks. The OEM has to prove that there is always a possibility to avoid harm to people under normal operating conditions as well as under system fault conditions. Therefore the integrator, often a 1st tier supplier, has to integrate all functions in such a way that non-safety functions can never interfere with safety functions and cause them to fail. Besides many other aspects, functional safety makes functional integration even more of a challenge.
An average body ECU performs functions like lighting, wipers, access and other body control functions, often accompanied by communication gateway tasks. Event-based processes run in parallel to periodic tasks. Application functions are provided by the 1st tier as well as by other 1st and 2nd tiers and the OEM itself. Customers expect a typical look and feel of the brand as well as features which the OEM counts as its core competency. So the project has to handle not only a variety of applications but also multiple suppliers with different buyer-customer relationships, including difficult IP issues. The result is a multi-dimensional complexity.
Integrating non-safety and safety-related functions in one ECU requires the integrator to know which function is executed when, and to ensure that the processing time consumed by non-safety functions never exceeds the limits that the safety functions can tolerate. A project has to prove for every E/E system component that it cannot influence safety-related functions.
The principle is illustrated in Figure 4. SW Component n has the highest Automotive Safety Integrity Level (ASIL), here ASIL C. Therefore this level is passed down to all drivers and to the operating system. Any other application software component needs to be encapsulated, regardless of whether it is classified as a safety function (here SW Component 1) or as a non-safety function (here SW Component 2, which is quality assured, “QA”).
Figure 4: Separation of SW Components according to their safety levels.
2. The event chain (‘Wirkkette’)
Since we calculated that 200 ms are left for any ECU activity to reach a safe state, we must describe and analyze all these possible ECU activities as a chain of consecutive events and actions: the event chain. We derive the chain elements and their timing from the fault situation ‘broken switch’ as well as from the hardware and software architecture of the system. We have to understand all possible system states (even seemingly irrelevant ones) that can occur within the ECU and have to prove that there is no situation in which the safe state is reached later than 200 ms.
One has to bear in mind that actions in an ECU are not performed immediately after a state has been reached. They are often executed with lower priority, meaning they are delayed depending on the states and actions of other functions. Or they have a certain execution period, resulting in a jitter of at most one period. So the event chain will (nearly) never be executed in just the time that results from statically summing up the execution times of all actions along the chain.
Our critical event chain starts with reading the corresponding I/O port and debouncing the switch input signal. Since the I/O read is performed periodically in one task and the calculation of the seat controller application in another task, the data is not passed on and processed immediately. The state machine of the seat controller considers various input signals and may delay its reaction because it is waiting for an input signal, e.g. the vehicle speed via CAN. In addition, the state machine shares the CPU time with various other application and operating system processes. We will not detail the event chain in this paper, but it is modeled and analyzed very precisely for the functional safety validation.
Since modern body ECUs execute software for various functions, often from multiple development teams and possibly from different suppliers, the event chain has segments from many developers. Compiling the end-to-end event chain becomes a team effort, which improves the collaboration among the involved suppliers. Developers responsible for one part (e.g. the communication stack) get access to the overall picture and gain a better system understanding. They get a more detailed description of the output of preceding functions and can easily evaluate the impact of their design decisions on the complete system.
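The difference between the static sum and the real reaction time can be illustrated with a simple bound. The sketch below assumes a chain of independent periodic stages in which data may have to wait up to one full period before a stage picks it up. The stage names, the 10 ms controller period and all response times are hypothetical values chosen for illustration; only the 500 µs I/O period comes from the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    period_us: int    # activation period of the task hosting this stage
    response_us: int  # assumed response time once the stage is activated


def best_case_us(chain):
    # Every stage happens to be activated exactly when its input arrives.
    return sum(s.response_us for s in chain)


def worst_case_us(chain):
    # Every stage has just missed its input and waits one full period.
    return sum(s.period_us + s.response_us for s in chain)


# Hypothetical chain; only the 500 µs I/O period is taken from the example.
chain = [
    Stage("read I/O port", 500, 50),
    Stage("seat controller", 10_000, 400),
    Stage("set power stage output", 500, 50),
]

best = best_case_us(chain)
worst = worst_case_us(chain)
```

Even in this simplified model the worst case is more than twenty times the static sum. A real analysis additionally has to account for preemption, blocking and interrupt load, which this bound does not cover.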
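Waiting for the vehicle speed does not have to be unbounded, because a CAN message counts as lost if it is not received within 50 ms. The following sketch shows a timeout monitor for the speed signal. The class `SpeedSignalMonitor` and its methods are hypothetical, and the article does not specify how the controller reacts to a lost message or how the exact 50 ms boundary is treated.

```python
CAN_TIMEOUT_MS = 50  # loss criterion from the component list


class SpeedSignalMonitor:
    """Tracks the age of the last received vehicle speed message."""

    def __init__(self, timeout_ms: int = CAN_TIMEOUT_MS):
        self.timeout_ms = timeout_ms
        self.last_rx_ms = None
        self.speed_kmh = None

    def on_message(self, now_ms: int, speed_kmh: float) -> None:
        # Called from the (hypothetical) CAN receive path.
        self.last_rx_ms = now_ms
        self.speed_kmh = speed_kmh

    def is_valid(self, now_ms: int) -> bool:
        # No message received yet: the signal is not usable.
        if self.last_rx_ms is None:
            return False
        # Assumed boundary handling: an age of exactly 50 ms is still valid.
        return (now_ms - self.last_rx_ms) <= self.timeout_ms


monitor = SpeedSignalMonitor()
valid_before_first_message = monitor.is_valid(0)
monitor.on_message(now_ms=0, speed_kmh=30.0)
valid_after_20_ms = monitor.is_valid(20)   # nominal 20 ms cycle
valid_after_51_ms = monitor.is_valid(51)   # message considered lost
```

For the timing model this means that the speed signal can contribute up to the timeout value to the event chain if the controller waits for it before reacting.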
3. Modeling timing and critical event chains
Using a model-based approach we are able to express the system’s behavior and inspect the reactions to faults much earlier than with prototypes. To model and analyze the event chain we use timing models, which are simulated and validated using the INCHRON Tool-Suite.
The system is modeled either using the modeling GUI of the Tool-Suite or by annotating the timing and performance relevant parameters in a Rhapsody UML or PREEvision system model. The model describes resources (CPUs and buses), tasks to be executed by the resources (processes, runnables, OSEK tasks, bus messages), the scheduling mechanisms, the stimulations of the system (IRQs, bus messages) and the execution times of the tasks and OS functions (see Figure 5). For further details on the modeling, see the references at the end of this article.
Figure 5: System model with timing annotations.
The event chain is modeled in the timing model as well. Every task along the chain is activated by certain events and will itself activate other tasks. To calculate the timing along the chain, the analysis incorporates all events and activations as well as the consecutive steps, in particular the data flow. If, for example, a bus message or a variable is updated periodically and contains data that is important for an element of the event chain, it is essential to consider which iteration of the message or variable update held the data. It matters whether asynchronous readouts ‘miss’ the data or inadvertently read the same data twice.
In an ECU with multiple functions and applications several event chains interfere with each other. In our example the seat control runs on an ECU with other functions, some of which may have safety aspects as well. Therefore it is crucial to identify the critical event chains for the safety-critical application parts, and this may become a challenge. One has to identify which functions are preconditions for the change of the system state to ‘safe state’. In our example: is the speed information received via CAN obligatory (i.e. part of the critical event chain) for detecting the fault on the switch within the ‘driving state’? The safety analysis has to include all possibilities and find the critical functions and states. An examination of the complete system is mandatory and has to include timing analysis of the critical event chains. Even if the ECU is ‘fast’ compared to human reactions and mechanical parts, it has to be proven that it is fast enough under all conditions.
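The effect of missed and doubly read data can be reproduced with a few lines of simulation. The sketch assumes that the producer updates its value instantaneously at the start of each of its periods and that the consumer reads at fixed intervals. The function names are hypothetical; the 400 µs and 500 µs periods are borrowed from the sensor and I/O periods of the example purely for illustration.

```python
def read_iterations(producer_period_us, consumer_period_us, duration_us,
                    consumer_offset_us=0):
    """Return the producer iteration index seen at each consumer read."""
    seen = []
    t = consumer_offset_us
    while t < duration_us:
        # Index of the most recent producer update at time t.
        seen.append(t // producer_period_us)
        t += consumer_period_us
    return seen


def classify(seen):
    """Count duplicate reads and producer iterations that were never read."""
    pairs = list(zip(seen, seen[1:]))
    duplicates = sum(1 for a, b in pairs if b == a)
    missed = sum(b - a - 1 for a, b in pairs if b - a > 1)
    return duplicates, missed


# Fast producer (400 µs), slow consumer (500 µs): data gets lost.
fast_to_slow = classify(read_iterations(400, 500, 4000))

# Slow producer (500 µs), fast consumer (400 µs): data is read twice.
slow_to_fast = classify(read_iterations(500, 400, 4000))
```

Within 4 ms the first configuration misses one producer iteration, while the second reads two values twice. Which of the two effects is harmful depends on the function: a lost value may delay fault detection, a duplicate may distort a counter or filter.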
Figure 6: Asynchronous processing leads to real-time violations.
Figure 6 shows a configuration of functions in principle. If the function blocks are processed in asynchronous, periodic tasks, drifting clocks have the effect that the identical event chain is sometimes executed fast enough and sometimes not. Data misses its periodically activated receiving function and has to wait one iteration before it is processed further. Other reasons for jitter, and sometimes even for lost data or activations, are bursts of interrupts, e.g. from bus communication, or situations in which higher priority tasks preempt or lower priority tasks block the execution of the critical functions along the event chain. The latter case, a priority inversion, may happen when the low priority task is not preemptable or when a semaphore is not released in time.
An analysis of task and event chain timing reveals how the system schedules its functions and how critical data is processed. A validation finds the best and worst case timings and system reactions. Simulation reveals individual system behavior and allows statistical analysis (Figure 7) as well as a detailed understanding of the system behavior. The INCHRON Tool-Suite offers both simulation and mathematical validation, together with a variety of diagrams and reports representing different views on the system’s behavior.
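The influence of drifting clocks can be approximated by sweeping the phase offset between two periodic tasks. The sketch below models a two-stage chain in which task A picks up an event and task B processes A's output. The function names, the event time, the periods and the execution time of A are assumptions for illustration, and task B's own execution time is neglected.

```python
def next_activation(t, period, offset):
    """First activation time >= t of a periodic task (integer arithmetic)."""
    k = -((offset - t) // period)  # ceiling of (t - offset) / period
    return offset + k * period


def chain_latency_us(event_us, period_a, period_b, offset_b, exec_a=0):
    """Time from the event until task B is activated with A's result."""
    # Task A runs at multiples of period_a and needs exec_a to produce output.
    t_a_done = next_activation(event_us, period_a, 0) + exec_a
    # Task B runs at offset_b + k * period_b and needs A's finished output.
    t_b = next_activation(t_a_done, period_b, offset_b)
    return t_b - event_us


# Sweep the phase of task B to imitate slowly drifting clocks.
latencies = [
    chain_latency_us(event_us=100, period_a=500, period_b=400,
                     offset_b=off, exec_a=50)
    for off in range(0, 400, 50)
]
fastest_us, slowest_us = min(latencies), max(latencies)
```

For the same event the latency varies between 450 µs and 800 µs depending only on the phase of task B. If a deadline lies between these two values, the chain meets it in some phases and violates it in others, which is the behavior shown in Figure 6.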
Figure 7: Process latency distribution of an OSEK task.
The introduction of functional safety to already complex embedded systems is a challenge for all involved parties. The integrator in particular needs detailed knowledge of the planned system’s behavior in order to design an architecture capable of fulfilling the functional safety requirements. Timing and performance analysis of functions and event chains has to be an integral part of the safety process. The dynamic system behavior needs to be considered by all project parties, from the early specification down to the validation of its correct implementation. This requires a close, tool-supported collaboration between safety engineers and system architects.
The paper describes the basic steps along the safety process and highlights how timing and performance are key factors. Mechanical design, electronics and software will have a common architectural blueprint from which critical event chains and timing budgets are derived. Because a model-based approach allows this to be tested early, a robust architecture can be found faster and the safety goals can be achieved sooner. The example demonstrated that even small functions are not trivial, especially when they rely on complex integrated systems. Nevertheless it was shown that the modeling of timing and performance is feasible with the right methods and tools.
 ISO 26262: Functional Safety – Road Vehicles
 Standish Group, "CHAOS", www.standishgroup.com/chaos.html
 IBM Rational Rhapsody copyright IBM International Business Machines Corp.; PREEvision copyright aquintos GmbH
 Fernlehrgang ISO 26262; J. Belz, Prof. M. Broy, Dr. G. Glöe, C. Jung, S. Kovacevic, M. Maihöfer, J. Philipps, A. Reuter; www.car-training-institute.com/iso26262
 B. Augustin, Audi AG; Integration of previously independent control units and functions – challenges of a collaboration project; 2. Fachkongress Echtzeitentwicklung 2010; www.echtzeitkongress.de
 T. Kramer, R. Münzenberger; New Functions, New Sensors, New Architectures – How to Cope with the Real-Time Requirements; In proceedings of Advanced Microsystems for Automotive Applications 2009, Berlin; www.amaa.de; ISBN 978-3-642-00744-6
 A. Wolfram and M. Makarov Continental AG, T.Kramer, W. Ramisch, R. Münzenberger; Design of Robust System Architectures for Automotive ECUs; In proceedings of Conquest 2009, Nuremberg; www.isqi.org/konferenzen/conquest/2009/
Dr. Ralf Münzenberger (Muenzenberger@INCHRON.com) is Co-Founder and Managing Director Professional Services of INCHRON GmbH, Potsdam, Germany
Dipl.-Ing. Tapio Kramer (Kramer@INCHRON.com) is Marketing and Product Manager at INCHRON GmbH, Garching, Germany
Juergen Belz (firstname.lastname@example.org) is CEO of Prometo GmbH, Paderborn, Germany.