Static analysis tools will help the development of embedded systems using the Rust language, says Fabrice Derepas, Co-founder & Chief Evangelist of TrustInSoft in Paris, France.
C and C++ continue to dominate the embedded-systems landscape. But developers are equally aware of the way in which the use of these languages can lead to problems during development. Their treatment of pointers and similar objects can lead to serious memory-safety issues.
Rust offers a similar syntax but shows great promise as a language that provides much of the flexibility of these older languages but with much stronger guarantees of safe operation. Rust uses elements pioneered in functional languages and other advanced concepts that are now taught to young software developers at college.
But an important factor behind the growth in Rust support is that it overcomes many of the memory-related problems encountered by C and C++ programmers, and the users of their code. These features are helping Rust become a strategic choice for the development of new software modules in high-criticality systems, such as those found in the automotive, industrial, and other sectors.
Rust is gaining broad support: several of the largest software companies in the world are already major users, thanks to its emphasis on reliability and memory safety.
At the end of 2022, Rust became the first language other than C to be supported by the Linux community for the development of kernel modules. Rust gained further momentum earlier this year with the publication of a report by the US White House Office of the National Cyber Director that called for the use of memory-safe languages, such as Rust, to guard against cyber-attacks.
Memory issues
A significant difference between C/C++ and Rust lies in the treatment of pointers. The pointer provides a low-overhead way to manipulate data in memory. But the ease with which pointers can be created and modified by a C or C++ program makes them risky to use.
For example, a function in a program may define a pointer to access memory in a temporary buffer allocated by the operating system. Attempts to use that pointer after another part of the code has deallocated the memory will probably result in data corruption. The failure of the program may well follow in short order. Similarly, a null pointer that has been defined but not initialised correctly should generate a memory-access fault when used. This will often also lead to a program failure.
Programs can also fail to deallocate memory when it is no longer needed. In a long-running program, such memory leaks will lead to system instability when the operating system cannot find any free memory to allocate to additional objects.
There are other risks. If the address a pointer contains strays beyond the bounds of the memory allocated to a data structure or buffer, the result is often dangerous data corruption. Hackers exploit this property with buffer-overflow attacks. Illegal writes made through buffer overflows remained the most common vulnerability in the 2022 edition of Mitre’s Common Weakness Enumeration (CWE).
A safer choice
Rust users can avoid many of these and other memory-safety issues by taking advantage of its strict rules and built-in support for memory allocation. Compile-time checks help guarantee the correct behaviour of references. Rust has datatypes that provide pointer-like features, but which are supported by compile-time checks. These checks help prevent the issues encountered with C-like pointers.
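As a minimal sketch of what these pointer-like datatypes look like in practice, the example below uses a borrowed slice and an `Option` in place of a raw buffer pointer and a null value. The function names and sensor readings are hypothetical and chosen only for illustration.

```rust
/// Returns a reference to the first reading above `limit`, if any.
/// `None` takes the place of a null pointer, so callers must handle it.
fn first_above(readings: &[u16], limit: u16) -> Option<&u16> {
    readings.iter().find(|&&r| r > limit)
}

/// Bounds-checked lookup: an out-of-range index yields `None`
/// instead of reading past the end of the buffer.
fn reading_at(readings: &[u16], index: usize) -> Option<u16> {
    readings.get(index).copied()
}

fn main() {
    let readings: [u16; 4] = [120, 480, 975, 310];

    // The compiler rejects any attempt to use the result without
    // first handling the `None` case.
    match first_above(&readings, 900) {
        Some(value) => println!("first reading above limit: {value}"),
        None => println!("no reading above limit"),
    }

    println!("index 2: {:?}", reading_at(&readings, 2));
    println!("index 9: {:?}", reading_at(&readings, 9));
}
```

The slice carries its own length, so the out-of-range lookup returns `None` rather than touching memory outside the buffer.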
The memory model supported by Rust also ensures that temporary memory structures will be safely deleted once they are no longer required by the program. Importantly for real-time systems, there is no need to run a garbage-collection process in the background.
By providing memory-safe structures and manipulation techniques, Rust can speed up the development and testing of software. And it leverages the skills that college-educated developers now learn. These two factors are important in sectors such as automotive, where the software content of vehicles is growing rapidly.
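The scope-based cleanup of temporary structures can be illustrated with a short sketch. The `TempBuffer` type and its `released` flag are hypothetical, added here only so that the moment of cleanup can be observed.

```rust
use std::cell::Cell;

/// Hypothetical temporary buffer. The `released` flag exists only so
/// the caller can observe when cleanup has happened.
struct TempBuffer<'a> {
    data: Vec<u8>,
    released: &'a Cell<bool>,
}

impl Drop for TempBuffer<'_> {
    fn drop(&mut self) {
        // Runs automatically when the owner goes out of scope.
        self.released.set(true);
    }
}

fn checksum_with_temp_buffer(input: &[u8], released: &Cell<bool>) -> u32 {
    let buffer = TempBuffer { data: input.to_vec(), released };
    let sum: u32 = buffer.data.iter().map(|&b| u32::from(b)).sum();
    sum
    // `buffer` and its heap allocation are freed here, at a point
    // that is fixed at compile time. No garbage collector is involved.
}

fn main() {
    let released = Cell::new(false);
    let sum = checksum_with_temp_buffer(&[1, 2, 3], &released);
    println!("checksum = {sum}, buffer released = {}", released.get());
}
```

Because the release point is known at compile time, the cost of freeing the buffer is paid at a predictable place in the function rather than at an arbitrary later moment.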
The need for legacy
However, reuse of existing code modules is equally important to organisations developing high-criticality systems. Safety-critical development needs to be conservative. Changes should be made to existing systems only where necessary. It is impractical and even undesirable to rewrite modules in a new language, even if its protection mechanisms offer significant advantages over legacy C or C++. These existing modules will need to be verified once integrated into a target that includes large portions written in Rust or a similarly memory-safe language.
There are also situations where, even if a language does offer strong behaviour guarantees, engineers need to perform additional checks to ensure safety. This is particularly true of embedded control. Many low-level interactions, such as accesses to memory-mapped hardware registers or data buffers, cannot easily be performed using Rust’s references and similar memory-safe elements. Programmers can manage these interactions in Rust by using raw pointers. These pointers behave similarly to those provided by C and C++. But without Rust’s safeguards they need additional checks.
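A minimal sketch of such a low-level access is shown here. An ordinary local variable stands in for a hardware register so that the example can run on a desktop machine; on a real target the address would come from the device's memory map. The function name `set_bits` is hypothetical.

```rust
use std::ptr::{read_volatile, write_volatile};

/// Sets the bits in `mask` in the 32-bit register at `reg`.
///
/// # Safety
/// `reg` must be a valid, aligned address that is safe to read and write.
/// The compiler cannot check this; the caller has to guarantee it.
unsafe fn set_bits(reg: *mut u32, mask: u32) {
    // Volatile accesses stop the compiler from eliding or merging
    // the read and the write, as is required for hardware registers.
    let current = unsafe { read_volatile(reg) };
    unsafe { write_volatile(reg, current | mask) };
}

fn main() {
    // Stand-in for a memory-mapped register (an assumption made for
    // illustration; real code would use a fixed hardware address).
    let mut fake_register: u32 = 0b0001;

    unsafe { set_bits(&mut fake_register, 0b0100) };

    println!("register = {:#06b}", fake_register);
}
```

Nothing in the language checks that the address passed to `set_bits` is valid, which is why code of this kind calls for additional verification.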
Advanced static analysis
The static analysis performed by the Rust compiler is conservative, and inevitably so because of limits on the depth of analysis it can perform. Code that dereferences a raw pointer in Rust will trigger a compile error unless developers encapsulate that code within unsafe{} blocks. This marking tells the compiler to accept operations whose safety it cannot verify. As a result, the compiler provides no guarantees of memory safety for those operations, and responsibility for them passes to the developer.
There are other situations where developers need to use unsafe{} blocks. Without them, the compiler will disallow any calls to unsafe functions or methods, as well as code that attempts to access the fields of unions. Though unions are not a core Rust feature, support for unions can be important to provide compatibility with C and C++. The compiler cannot guarantee the safety of operations on any of the fields because it cannot determine whether writes to one field will or will not corrupt the other fields that share the same memory structure.
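The union case can be sketched as follows. The `Word` type and the `bytes_of` helper are hypothetical; `#[repr(C)]` gives the union a C-compatible layout of the kind used when sharing data with C or C++ code.

```rust
/// Hypothetical C-compatible union: both fields share the same four bytes.
#[repr(C)]
union Word {
    as_u32: u32,
    as_bytes: [u8; 4],
}

fn bytes_of(value: u32) -> [u8; 4] {
    // Initialising a union through one field is a safe operation.
    let word = Word { as_u32: value };

    // Reading a field is not: the compiler does not track which field
    // was written last, so the read must sit inside an unsafe block.
    // It is sound here because every bit pattern is a valid [u8; 4].
    unsafe { word.as_bytes }
}

fn main() {
    // The byte order printed depends on the endianness of the target.
    println!("{:?}", bytes_of(0x0102_0304));
}
```

If the second field were a type with invalid bit patterns, such as `bool`, the same read could produce undefined behaviour, and the compiler would not flag it.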
There are many situations where unsafe markings are required in native Rust code. Within crates.io, the general-purpose package collection that Rust uses, roughly 30% of the packages use the unsafe{} construct. A compiler cannot check the safety of the operations inside those blocks.
However, formal verification and mathematical-reasoning techniques exist that make it possible to analyse code before execution. They can determine whether code will suffer from memory-safety issues such as buffer overflows, null-pointer accesses and other problems. These static-analysis techniques are exhaustive in a way that dynamic testing is not, providing guarantees that even extensive test campaigns cannot.
Developers are working on tools for Rust-based code that can highlight detected problems, indicating where programmers need to insert additional safeguards such as checks on the address range of a pointer reference. As Rust will often need to coexist with C and C++ modules from earlier development, tools like TrustInSoft Analyzer will be used to ensure the combined codebase is free of memory issues.
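An address-range safeguard of the kind mentioned above might look like the following sketch. The helper `read_checked` is hypothetical and is not taken from any particular tool's output.

```rust
/// Reads one byte at `offset` from a raw buffer, refusing null
/// pointers and out-of-range offsets.
///
/// # Safety
/// If `base` is non-null, it must point to at least `len` readable bytes.
unsafe fn read_checked(base: *const u8, len: usize, offset: usize) -> Option<u8> {
    // The safeguard: reject anything outside the allocated range
    // before the pointer is dereferenced.
    if base.is_null() || offset >= len {
        return None;
    }
    Some(unsafe { *base.add(offset) })
}

fn main() {
    let buffer = [10u8, 20, 30];

    let inside = unsafe { read_checked(buffer.as_ptr(), buffer.len(), 1) };
    let outside = unsafe { read_checked(buffer.as_ptr(), buffer.len(), 3) };

    println!("offset 1: {inside:?}, offset 3: {outside:?}");
}
```

The check covers the offset but still relies on the caller supplying a correct `len`, which is the sort of assumption a static analyser can be asked to confirm across the whole codebase.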
Even in modules that rely fully on the memory-safety features of Rust and which do not include code inside unsafe{} blocks, it will be important to test program behaviour exhaustively before deployment. Errors that are caught by Rust at runtime will often terminate the program completely, which is not acceptable in high-criticality systems. Static tools can examine the likelihood of these situations occurring and warn the development team so that any problems are fixed before product release.
There are also some behaviours that are defined but not desired. For instance, a division by zero or certain other errors can trigger a “panic”, which could cause the system to crash. Detecting all of these unwanted behaviours is another key reason to use an advanced static analyser.
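Once a potential panic site has been identified, one common remedy is to replace the panicking operation with a checked equivalent from the standard library. The sketch below assumes a hypothetical calibration table and function name; `checked_div` and slice `get` are standard methods that return `None` where `/` and `[]` would panic.

```rust
/// Looks up a table entry and divides it by `divisor`.
/// Returns `None` for an out-of-range index or a zero divisor,
/// the two cases in which `table[index] / divisor` would panic.
fn scaled_reading(table: &[u16], index: usize, divisor: u16) -> Option<u16> {
    table.get(index)?.checked_div(divisor)
}

fn main() {
    let table: [u16; 2] = [100, 200];

    println!("{:?}", scaled_reading(&table, 1, 4)); // valid lookup and divisor
    println!("{:?}", scaled_reading(&table, 1, 0)); // zero divisor
    println!("{:?}", scaled_reading(&table, 5, 4)); // index out of range
}
```

The caller then decides how to recover, for example by falling back to a safe default value, instead of the program terminating.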
Verification of correct behaviour can be further augmented by automatically generating assertions to test whether code written in any language can handle unexpected or out-of-range inputs safely. This kind of testing emulates the “fuzzing” that is often used by hackers to identify vulnerabilities.
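As a hand-written stand-in for this kind of generated check, the sketch below drives a small function with boundary values and a deterministic stream of pseudo-random inputs, asserting that the result always stays in range. The function `duty_cycle_percent` and the input generator are assumptions made for illustration. A formal tool would reason about all inputs rather than sampling them.

```rust
/// Hypothetical function under test: converts a raw request into a
/// duty cycle between 0 and 100 percent.
fn duty_cycle_percent(raw: i32) -> u8 {
    raw.clamp(0, 100) as u8
}

/// Checks the range property on a set of inputs and returns how many
/// inputs were exercised.
fn check_inputs() -> usize {
    // Boundary and out-of-range values that often expose defects.
    let edge_cases = [i32::MIN, -1, 0, 1, 99, 100, 101, i32::MAX];

    // Deterministic pseudo-random inputs from a simple linear
    // congruential generator, so every run checks the same values.
    let mut state: u32 = 0x1234_5678;
    let random_inputs = (0..1_000).map(move |_| {
        state = state.wrapping_mul(1_664_525).wrapping_add(1_013_904_223);
        state as i32
    });

    let mut checked = 0;
    for raw in edge_cases.iter().copied().chain(random_inputs) {
        assert!(duty_cycle_percent(raw) <= 100, "out of range for input {raw}");
        checked += 1;
    }
    checked
}

fn main() {
    println!("{} inputs handled safely", check_inputs());
}
```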
Such tools will provide formal, verifiable proof of the absence of memory-safety vulnerabilities that could cause safety-critical vehicles to behave unpredictably and dangerously. To prevent developers being overwhelmed with potential errors, some tools have been designed to limit the number of false positives, so that they only point to code that is likely to suffer from memory-safety issues.
As Rust continues to make further inroads into high-criticality systems development, there will always be a need to verify that external code modules and low-level functions do not have latent issues that will disrupt operations in the field. Using additional static testing and verification ensures that developers will catch and fix undefined behaviours early in the integration cycle, and long before deployment.