Edmund Smith
August 09, 2023
Rust is a modern language known for its memory safety, efficiency, and wide range of high-level features. But many beginners also run into something else in Rust: how surprisingly difficult it is to represent some common designs. The intuition we all have about which designs will lead to difficult code often fails newcomers to Rust, especially when that intuition was trained on other languages, like C++.
In this series of blog posts, we'll go through a Rust crate which can simplify the creation and management of interconnected objects. It's not a complete solution, and we'll discuss its limitations in detail in future parts, as well as what it might take to deal with these problems more comprehensively.
Let's think about a toy problem, so simple it may not even seem worth discussing at first. We have two types of objects, users and groups, and some links between them. Let's say we're going to build up a store of them in memory. Maybe it's part of a slightly beefed-up authentication system we're writing.
In C++, we might start with something like this:
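#include <memory>
#include <optional>
#include <vector>

class group;

class user {
    // Each user shares ownership of the groups it belongs to.
    std::vector<std::shared_ptr<group>> groups;
};

class group {
    // A group may or may not have a leader.
    std::optional<std::shared_ptr<user>> leader;
};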
Things are not completely straightforward even here, and we'll probably end up with weak pointers in the mix somewhere in the final design, but it's a starting point that we can build code upon.
There's nothing directly analogous to this in Rust, for reasons we'll explore later. We might try something like this:
use std::sync::Arc;

struct User {
    groups: Vec<Arc<Group>>,
}

struct Group {
    leader: Option<Arc<User>>,
}
Immediately, we run into problems. An Arc does not provide mutable access to its contents once there are multiple pointers to the same data.
Suppose we have one user, u, and one group, g, where u is the leader of g. We might try:
let g = Arc::new(Group { leader: None });
let u = Arc::new(User { groups: vec![g.clone()] });
At this point we're stuck: we can't edit g to make u the leader. If we flip things around, it's not any better:
let u = Arc::new(User { groups: Vec::new() });
let g = Arc::new(Group { leader: Some(u.clone()) });
Now we can't go back and edit u to mark them as a member of g. One way or another, our representation is incomplete.
Rust aficionados might note that if we were really careful with ordering, and also avoided the clone, then we could muddle through this example using get_mut on Arc, which only works when a single copy of the Arc exists. But this approach would become increasingly impractical as we add more users and groups, or if the model were to become more complex.
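To make the get_mut restriction concrete, here is a minimal sketch using only the standard library. The helper names try_set_leader and try_join are hypothetical, introduced purely for illustration. It shows that editing works while an Arc is unshared, and is refused as soon as a second pointer to the same data exists.

```rust
use std::sync::Arc;

struct User {
    groups: Vec<Arc<Group>>,
}

struct Group {
    leader: Option<Arc<User>>,
}

/// Hypothetical helper: make `leader` the leader of `g`, if `g` is unshared.
fn try_set_leader(g: &mut Arc<Group>, leader: Arc<User>) -> bool {
    // `Arc::get_mut` returns `Some` only when no other pointer to the data exists.
    match Arc::get_mut(g) {
        Some(group) => {
            group.leader = Some(leader);
            true
        }
        None => false,
    }
}

/// Hypothetical helper: add `g` to the groups of `u`, if `u` is unshared.
fn try_join(u: &mut Arc<User>, g: Arc<Group>) -> bool {
    match Arc::get_mut(u) {
        Some(user) => {
            user.groups.push(g);
            true
        }
        None => false,
    }
}

fn main() {
    let mut u = Arc::new(User { groups: Vec::new() });
    let mut g = Arc::new(Group { leader: None });

    // `g` has a single owner at this point, so we can still edit it.
    let leader_set = try_set_leader(&mut g, u.clone());
    // The group now holds a second pointer to `u`, so `u` can no longer be edited.
    let joined = try_join(&mut u, g.clone());

    println!("leader set: {}, joined: {}", leader_set, joined);
    println!(
        "g has a leader: {}, u is in {} group(s)",
        g.leader.is_some(),
        u.groups.len()
    );
}
```

Setting the leader succeeds, but doing so shares u, so the second half of the link can never be recorded. Whichever side we fill in first, the other side ends up frozen.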
We'll explore other approaches in later parts of this series, but none of them safely solves this basic case satisfactorily. We'll talk about why that is in detail - why this is fundamentally hard in Rust - in the next part. In short, it's because Rust's safety comes in part from controlling data access patterns, and an object soup makes that very difficult. But to skip to the subject of this series: here's a viable alternative using persian-rug:
use persian_rug::{contextual, persian_rug, Proxy};

#[contextual(Rug)]
struct User {
    groups: Vec<Proxy<Group>>,
}

#[contextual(Rug)]
struct Group {
    leader: Option<Proxy<User>>,
}

#[derive(Default)]
#[persian_rug]
struct Rug(#[table] User, #[table] Group);
So what is all of this? And what's this Rug? The basic idea of persian-rug is to control access to the whole soup with a single object. To do this, the single object - the rug - owns all the memory, and we refer to objects with lightweight links called proxies. You can think of a proxy as a handle which requires a reference to the rug in order to obtain a reference to the object. The macros you see above, like contextual and persian_rug, just add the supporting code you need to make this all work conveniently, without too much additional noise.
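The idea of a single owner plus lightweight handles can be sketched by hand with only the standard library. This is not how persian-rug is implemented: the Store and Handle types below are hypothetical stand-ins, assumed here only to illustrate why a handle that needs a reference to its owner plays well with Rust's borrowing rules.

```rust
use std::marker::PhantomData;

/// A lightweight, copyable link to a `T` held by a `Store`.
/// (Hypothetical stand-in for the idea of a proxy; not persian-rug's type.)
struct Handle<T> {
    index: usize,
    _marker: PhantomData<T>,
}

// Manual impls so that `Handle<T>` is `Copy` even when `T` is not.
impl<T> Clone for Handle<T> {
    fn clone(&self) -> Self {
        *self
    }
}
impl<T> Copy for Handle<T> {}

struct User {
    groups: Vec<Handle<Group>>,
}

struct Group {
    leader: Option<Handle<User>>,
}

/// Owns every object; all access goes through a reference to the store.
#[derive(Default)]
struct Store {
    users: Vec<User>,
    groups: Vec<Group>,
}

impl Store {
    fn add_user(&mut self, user: User) -> Handle<User> {
        self.users.push(user);
        Handle { index: self.users.len() - 1, _marker: PhantomData }
    }

    fn add_group(&mut self, group: Group) -> Handle<Group> {
        self.groups.push(group);
        Handle { index: self.groups.len() - 1, _marker: PhantomData }
    }

    // A handle alone is inert: resolving it requires borrowing the store.
    fn user(&self, h: Handle<User>) -> &User {
        &self.users[h.index]
    }

    fn user_mut(&mut self, h: Handle<User>) -> &mut User {
        &mut self.users[h.index]
    }

    fn group(&self, h: Handle<Group>) -> &Group {
        &self.groups[h.index]
    }
}

/// Builds one user leading one group; returns the store plus both handles.
fn build() -> (Store, Handle<User>, Handle<Group>) {
    let mut store = Store::default();
    let u = store.add_user(User { groups: Vec::new() });
    let g = store.add_group(Group { leader: Some(u) });
    // Handles are `Copy`, so `u` is still usable after being stored in the group.
    store.user_mut(u).groups.push(g);
    (store, u, g)
}

fn main() {
    let (store, u, g) = build();
    let leader = store.group(g).leader.expect("group has a leader");
    println!("leader is u: {}", leader.index == u.index);
    println!("u is in {} group(s)", store.user(u).groups.len());
}
```

Because handles hold no references, the two objects can point at each other freely, and the borrow checker only ever has to reason about borrows of the store itself. A hand-rolled version like this needs one set of accessors per type, which is the kind of supporting code the macros are there to generate.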
The above definitions give us a collection of interlinked objects which are fully mutable, and can be built conveniently. For example, here's a working version of the code we attempted with Arc using them:
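use persian_rug::Context; // trait providing add, get and get_mut

// The rug must be mutable, since adding and editing objects both change it.
let mut r: Rug = Default::default();
let u = r.add(User { groups: Vec::new() });
let g = r.add(Group { leader: Some(u) });
r.get_mut(&u).groups.push(g);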
Compared to our earlier attempt, this version has no additional construction order dependencies. By holding all the data in the rug, we're allowing Rust to treat the entire collection of objects as a thing to which access can be granted. That means we sidestep the problem of individual objects participating in a web of direct references to others. Rust's normal safety rules can be based on examining references to the rug, and most safety checking can happen at compile time.
In the following parts, we'll start by going into the details of why these soups are difficult to manage in Rust, and the built-in solutions you might try without persian-rug. Then we'll look at the details of how persian-rug works, and finally finish by looking at the limitations currently present, compared with other third-party solutions, and how those limitations might one day be improved.