A safe subset of C for automated conversion to Rust
Researchers in France have developed a subset of C that can compile to the Rust language with limited intervention.
The subset will help with the automatic translation of libraries written in C into Rust, yielding code that is less vulnerable to attacks that exploit memory-safety bugs.
“Many critical codebases remain authored in C, and cannot be realistically rewritten by hand. Automatically translating C to Rust is thus an appealing course of action,” said Aymeric Fromherz, a researcher at INRIA in Paris.
The researchers looked at what it would take to translate C to Rust to produce code that is trivially memory safe because it abides by Rust’s type system without caveats. The approach has been used to port the C version of the HACL* cryptographic library into Rust for the first time. This produced an 80,000-line verified cryptographic library, written in pure Rust, that implements all modern algorithms.
“Our work sports several original contributions: a type-directed translation from (a subset of) C to safe Rust; a novel static analysis based on ‘split trees’ that allows expressing C’s pointer arithmetic using Rust’s slices and splitting operations; an analysis that infers exactly which borrows need to be mutable; and a compilation strategy for C’s struct types that is compatible with Rust’s distinction between non-owned and owned allocations,” he said.
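To make the idea of replacing pointer arithmetic with slice splitting concrete, here is a minimal hand-written sketch in Rust. It is not output from the researchers' tool, and the function names `fill_halves` and `sum` are hypothetical. It shows how a C function that derives two pointers from one buffer could be expressed with `split_at_mut`, and how a read-only function needs only a shared borrow.

```rust
// Hypothetical C original, shown for comparison only:
//
//   void fill_halves(uint8_t *buf, size_t len) {
//       uint8_t *lo = buf;            /* first half  */
//       uint8_t *hi = buf + len / 2;  /* second half */
//       lo[0] = 1;
//       hi[0] = 2;
//   }

/// Writes through two disjoint views of the same buffer.
/// The pointer offset `buf + len / 2` becomes a split of the slice,
/// so the borrow checker can see that `lo` and `hi` never overlap.
/// Panics if `buf` has fewer than two elements.
fn fill_halves(buf: &mut [u8]) {
    let mid = buf.len() / 2;
    let (lo, hi) = buf.split_at_mut(mid);
    lo[0] = 1;
    hi[0] = 2;
}

/// Only reads the buffer, so a shared (immutable) borrow is enough.
fn sum(buf: &[u8]) -> u32 {
    buf.iter().map(|&b| u32::from(b)).sum()
}
```

Calling `fill_halves` on a four-byte zeroed array leaves it as `[1, 0, 2, 0]`. Choosing `&mut [u8]` for the writer and `&[u8]` for the reader is done by hand here; the article describes an analysis that infers this distinction automatically.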
Alongside the HACL* cryptographic library, the approach was applied to binary parsers and serializers from EverParse, showing that the subset of C is sufficient to translate both codebases to safe Rust. For the few places that do violate Rust’s aliasing discipline, automated, surgical rewrites are sufficient, he said.
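The article does not say what these rewrites look like. As an assumption-laden illustration of the general problem, the sketch below shows a common C idiom, passing the same buffer as both input and output, and one possible way to make it acceptable to Rust by copying the aliased input first. The names `add_into` and `add_in_place` are hypothetical and do not come from HACL* or EverParse.

```rust
// Hypothetical C call site that aliases its arguments:
//
//   add_into(x, x, y, n);   /* dst and a point to the same buffer */

/// Element-wise wrapping addition: dst[i] = a[i] + b[i].
/// Processes as many elements as the shortest of the three slices.
fn add_into(dst: &mut [u32], a: &[u32], b: &[u32]) {
    for (d, (&p, &q)) in dst.iter_mut().zip(a.iter().zip(b.iter())) {
        *d = p.wrapping_add(q);
    }
}

/// Computes x[i] = x[i] + y[i].
/// Writing `add_into(x, x, y)` would borrow `x` mutably and immutably
/// at once, which the borrow checker rejects. Copying the aliased
/// input into a temporary (an assumed fix, for illustration) removes
/// the overlap at the cost of one allocation.
fn add_in_place(x: &mut [u32], y: &[u32]) {
    let x_copy = x.to_vec();
    add_into(x, &x_copy, y);
}
```

The copy trades a small amount of performance for code that type-checks without `unsafe`. Whether the researchers' rewrites take this form is not stated in the article.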