Managing Object Lifetimes Safely in Rust Garbage Collector APIs
Question
I am building a simple mark-and-compact garbage collector in Rust. The internal implementation is acceptable, but the public API is fundamentally unsafe, and I want to redesign it so object validity is expressed more clearly through the type system.
Here is the current shape of the API:
/// Describes the internal structure of a managed object.
pub struct Tag {
    // ...
}
/// An unmanaged pointer to a managed object.
pub type Pointer = *mut usize;
/// Mapping from old object addresses to new object locations.
pub type Adjust = std::collections::BTreeMap<usize, Pointer>;
/// Mark this object and anything it points to as live.
pub unsafe fn survive(ptr: Pointer);
pub struct Heap {
    // ...
}
impl Heap {
    pub fn new() -> Heap {
        // ...
        unimplemented!()
    }
    /// Allocate an object with the specified structure.
    pub fn allocate(&mut self, tag: Tag) -> Pointer {
        // ...
        unimplemented!()
    }
    /// Move all live objects from `heap` into `self`.
    pub unsafe fn reallocate(&mut self, heap: Heap) -> Adjust {
        // ...
        unimplemented!()
    }
}
There are two important rules I want the API to model:
- All pointers to objects allocated in a Heap become invalid when that heap is merged into another heap.
- reallocate returns an Adjust whose values are valid pointers into self.
I considered replacing raw pointers with a lifetime-based handle type:
use std::collections::BTreeMap;
use std::marker::PhantomData;
use std::sync::atomic::AtomicUsize;
#[derive(Copy, Clone)]
pub struct Object<'a> {
    ptr: *mut AtomicUsize,
    mark: PhantomData<&'a usize>,
}
impl<'a> Object<'a> {
    pub fn survive(self) {
        // ...
    }
}
pub type Adjust<'a> = BTreeMap<usize, Object<'a>>;
pub struct Heap {
    // ...
}
pub struct Allocator<'a> {
    // ...
}
impl Heap {
    fn allocator<'a>(&'a self) -> Allocator<'a> {
        // ...
        unimplemented!()
    }
    // This does not work well:
    // fn allocate<'a>(&'a mut self, tag: Tag) -> Object<'a>;
    // fn reallocate<'a>(&'a mut self, heap: Heap) -> Adjust<'a>;
    //
}
impl<'a> Allocator<'a> {
    fn allocate(&mut self, tag: Tag) -> Object<'a> {
        unimplemented!()
    }
    fn reallocate(&mut self, heap: Heap) -> Adjust<'a> {
        unimplemented!()
    }
}
Is this kind of design correct for expressing object lifetime and invalidation in Rust? If not, what should be changed?
Short Answer
By the end of this page, you will understand how Rust lifetimes relate to ownership and borrowing, why they cannot directly express moving garbage-collected objects, and how safer APIs are usually built with handles, borrowing scopes, or indirection instead of raw pointers. You will also see practical Rust patterns for designing a heap API that prevents invalid references after compaction or heap merging.
Concept
Rust lifetimes describe how long references are allowed to be used, but they do not describe arbitrary semantic validity rules such as “this pointer becomes invalid after a heap compaction.” That is the core issue behind this question.
In a moving garbage collector, objects can change address. If your API exposes raw pointers, then compaction or heap merging can make previously returned pointers invalid. Rust's type system can help, but only if the API exposes values whose validity is tied to Rust's borrowing rules.
The key idea
A lifetime like Object<'a> only says:
- this value is connected to something borrowed for 'a
- the compiler will not let it outlive that borrow
It does not automatically mean:
- the pointed-to object will remain at the same memory address
- the object cannot be moved internally
- the handle is still valid after some unrelated mutation
That means a type like this:
pub struct Object<'a> {
    ptr: *mut AtomicUsize,
    mark: PhantomData<&'a usize>,
}
only pretends to be lifetime-aware. The compiler sees the PhantomData, but the raw pointer itself is still just a raw pointer. If the heap compacts and moves the object, the pointer becomes stale. The lifetime does not fix that.
Why this matters
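In real Rust APIs, if data may move or be invalidated by later operations, developers usually avoid returning direct pointers. Instead they return a stable handle or ID (optionally with a generation counter for stale-checking), or a reference that is borrowed from the owner for a limited scope.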
Mental Model
Think of the heap like a warehouse and objects like boxes inside it.
- A raw pointer is like writing down the exact shelf location of a box.
- A moving GC can reorganize the warehouse and move boxes to different shelves.
- After reorganization, your written shelf location may point to the wrong place.
A Rust lifetime is not a magical tracker for moving boxes. It is more like a rule saying:
“You may ask the warehouse manager for a box location only while you are standing at the desk.”
If the warehouse reorganizes after that, the old shelf note is no longer trustworthy.
A safer design is to give users a ticket number instead of a shelf location.
- The ticket number stays meaningful.
- The warehouse manager looks up the current shelf location when needed.
- If the box was removed or moved, the manager can detect that.
In Rust terms:
- pointer = direct address, fragile when objects move
- handle/id = stable identity, safer across internal movement
- borrowed reference = temporary access while the owner guarantees stability
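The "ticket number" idea, including the manager detecting a box that is gone, can be sketched with a generation counter per slot. This is a minimal illustration under assumptions: the names Warehouse, Ticket, store, and remove are hypothetical and are not part of the question's API, and strings stand in for managed objects.

```rust
/// Hypothetical ticket: a slot index plus the generation it was issued for.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Ticket {
    index: usize,
    generation: u32,
}

struct Slot {
    generation: u32,
    value: Option<String>,
}

#[derive(Default)]
pub struct Warehouse {
    slots: Vec<Slot>,
}

impl Warehouse {
    pub fn new() -> Self {
        Self::default()
    }

    /// Store a box, reusing an empty slot if one exists.
    pub fn store(&mut self, value: String) -> Ticket {
        if let Some(index) = self.slots.iter().position(|s| s.value.is_none()) {
            let slot = &mut self.slots[index];
            slot.value = Some(value);
            return Ticket { index, generation: slot.generation };
        }
        self.slots.push(Slot { generation: 0, value: Some(value) });
        Ticket { index: self.slots.len() - 1, generation: 0 }
    }

    /// Remove a box; bumping the generation invalidates old tickets.
    pub fn remove(&mut self, ticket: Ticket) -> Option<String> {
        let slot = self.slots.get_mut(ticket.index)?;
        if slot.generation != ticket.generation {
            return None;
        }
        let value = slot.value.take()?;
        slot.generation += 1;
        Some(value)
    }

    /// Look up a box; a stale ticket yields `None` rather than the wrong box.
    pub fn get(&self, ticket: Ticket) -> Option<&str> {
        let slot = self.slots.get(ticket.index)?;
        if slot.generation != ticket.generation {
            return None;
        }
        slot.value.as_deref()
    }
}

fn main() {
    let mut w = Warehouse::new();
    let old = w.store("lamp".to_string());
    w.remove(old);
    let new = w.store("chair".to_string()); // reuses slot 0 with generation 1
    assert_eq!(w.get(old), None); // stale ticket is detected
    assert_eq!(w.get(new), Some("chair"));
}
```

The old ticket and the new one share an index, yet only the new one resolves, because the lookup compares generations before touching the slot.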
Syntax and Examples
A fragile design: returning raw pointers
pub type Pointer = *mut usize;
impl Heap {
    pub fn allocate(&mut self, tag: Tag) -> Pointer {
        // returns direct address into the heap
        unimplemented!()
    }
}
This is unsafe for a moving collector because later compaction may relocate the object.
A safer design: return a handle
Instead of exposing the address, expose a stable object identifier.
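#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ObjectId(usize);
pub struct Heap {
    objects: Vec<String>,
}
impl Heap {
    pub fn new() -> Self {
        Self { objects: Vec::new() }
    }
    /// Store a value and return its stable ID (its index in `objects`).
    pub fn allocate(&mut self, value: String) -> ObjectId {
        let id = ObjectId(self.objects.len());
        self.objects.push(value);
        id
    }
    /// Resolve an ID through the heap; unknown IDs yield `None`.
    pub fn get(&self, id: ObjectId) -> Option<&str> {
        self.objects.get(id.0).map(String::as_str)
    }
}
fn main() {
    let mut heap = Heap::new();
    // The string literal is an arbitrary example value.
    let id = heap.allocate("hello".to_string());
    println!("{:?}", heap.get(id)); // prints Some("hello")
}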
Step by Step Execution
Consider this small example:
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct ObjectId(usize);
struct Heap {
    objects: Vec<String>,
}
impl Heap {
    fn new() -> Self {
        Self { objects: Vec::new() }
    }
    fn allocate(&mut self, value: String) -> ObjectId {
        let id = ObjectId(self.objects.len());
        self.objects.push(value);
        id
    }
    fn get(&self, id: ObjectId) -> Option<&String> {
        self.objects.get(id.0)
    }
}
fn main() {
    let mut heap = Heap::new();
    // The string literals are arbitrary example values.
    let a = heap.allocate("first".to_string());
    let b = heap.allocate("second".to_string());
    println!("{:?}", heap.get(a)); // prints Some("first")
    println!("{:?}", heap.get(b)); // prints Some("second")
}
Walking through main:
1. Heap::new() creates a heap whose objects vector is empty.
2. The first allocate call reads objects.len(), which is 0, pushes the string, and returns ObjectId(0), stored in a.
3. The second allocate call sees a length of 1 and returns ObjectId(1), stored in b.
4. heap.get(a) and heap.get(b) index into objects and return Some with a reference to each stored string. An ID that was never issued would produce None instead of touching invalid memory.
Real World Use Cases
Compilers and interpreters
Language runtimes often store objects in managed heaps. Exposing raw addresses is dangerous if objects can move during compaction. Handles or indices are safer.
Entity systems in games
Game engines commonly use entity IDs instead of direct pointers because entities may be moved, deleted, or recycled.
Caches and object stores
Large applications often return keys or handles for stored items instead of direct references, especially when the storage layer can reorganize data.
Databases and storage engines
A row ID or page ID is often safer than a direct memory address because the underlying storage may move or be rewritten.
API design for unsafe internals
Even when internals use raw pointers for speed, public APIs often wrap them in safer abstractions that control access and lifetime.
Real Codebase Usage
In real Rust codebases, developers usually combine a few patterns for this kind of problem.
1. Stable handles instead of raw pointers
A common design is:
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Handle(usize);
The handle is passed around freely, but all real access goes through the owning heap.
2. Borrow the heap to access data
impl Heap {
    pub fn get(&self, h: Handle) -> Option<&Object> { /* ... */ unimplemented!() }
    pub fn get_mut(&mut self, h: Handle) -> Option<&mut Object> { /* ... */ unimplemented!() }
}
This makes Rust enforce that mutation and reading follow borrowing rules.
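As a rough sketch of what that enforcement looks like in practice, the example below uses String as a stand-in for the unspecified Object type and adds a hypothetical compact method that takes &mut self. These names and the payload type are assumptions for illustration only.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Handle(usize);

pub struct Heap {
    objects: Vec<String>, // stand-in payload for this sketch
}

impl Heap {
    pub fn new() -> Self {
        Self { objects: Vec::new() }
    }

    pub fn allocate(&mut self, value: &str) -> Handle {
        self.objects.push(value.to_string());
        Handle(self.objects.len() - 1)
    }

    /// Shared access: the returned reference borrows the heap.
    pub fn get(&self, h: Handle) -> Option<&String> {
        self.objects.get(h.0)
    }

    /// Exclusive access for mutation.
    pub fn get_mut(&mut self, h: Handle) -> Option<&mut String> {
        self.objects.get_mut(h.0)
    }

    /// Hypothetical operation that may move objects; it needs `&mut self`.
    pub fn compact(&mut self) {
        self.objects.shrink_to_fit();
    }
}

fn main() {
    let mut heap = Heap::new();
    let h = heap.allocate("node");

    if let Some(s) = heap.get_mut(h) {
        s.push_str("-v2");
    }

    let r = heap.get(h);
    // heap.compact(); // error[E0502] if uncommented: `heap` is still borrowed by `r`
    assert_eq!(r.map(String::as_str), Some("node-v2"));

    heap.compact(); // fine here: `r` is no longer used
    assert_eq!(heap.get(h).map(String::as_str), Some("node-v2"));
}
```

The handle survives the call to compact, while the reference r cannot, which is exactly the split between long-lived identity and short-lived access.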
3. Guard collection with an explicit phase
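Some designs introduce an explicit collection or mutation session, so that allocation happens only while a session object borrows the heap and collection can run only once that session has ended.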
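One possible shape for such a session is sketched below. The Session type, the session and collect methods, and the explicit roots slice are all hypothetical, and the "collection" here simply clears unreachable slots rather than moving anything.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Handle(usize);

pub struct Heap {
    objects: Vec<Option<String>>,
}

/// Hypothetical mutation session: holds the exclusive borrow of the heap.
pub struct Session<'h> {
    heap: &'h mut Heap,
}

impl Heap {
    pub fn new() -> Self {
        Self { objects: Vec::new() }
    }

    /// Start a mutation phase. Collection cannot run while the session is in use.
    pub fn session(&mut self) -> Session<'_> {
        Session { heap: self }
    }

    /// Collection phase: clear every slot not listed in `roots`.
    pub fn collect(&mut self, roots: &[Handle]) {
        for (i, slot) in self.objects.iter_mut().enumerate() {
            if !roots.contains(&Handle(i)) {
                *slot = None;
            }
        }
    }

    pub fn get(&self, h: Handle) -> Option<&str> {
        self.objects.get(h.0)?.as_deref()
    }
}

impl Session<'_> {
    pub fn allocate(&mut self, value: &str) -> Handle {
        self.heap.objects.push(Some(value.to_string()));
        Handle(self.heap.objects.len() - 1)
    }
}

fn main() {
    let mut heap = Heap::new();
    let (a, b) = {
        let mut session = heap.session();
        let a = session.allocate("kept");
        // heap.collect(&[a]); // error[E0499] if uncommented: `session` is used again below
        let b = session.allocate("garbage");
        (a, b)
    }; // the session ends here, releasing the borrow

    heap.collect(&[a]);
    assert_eq!(heap.get(a), Some("kept"));
    assert_eq!(heap.get(b), None);
}
```

Because handles carry no lifetime, they can leave the session scope, while the borrow checker keeps allocation and collection from interleaving.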
Common Mistakes
Mistake 1: Assuming PhantomData makes a raw pointer safe
This is a very common misunderstanding.
pub struct Object<'a> {
    ptr: *mut usize,
    marker: std::marker::PhantomData<&'a usize>,
}
This tells the compiler that Object<'a> behaves as if it borrowed something for 'a, but it does not make ptr automatically valid.
How to avoid it
Use PhantomData only when you already have real invariants that justify it. Do not use it as a substitute for ownership or reference safety.
Mistake 2: Returning direct pointers from a moving structure
Broken idea:
fn allocate(&mut self) -> *mut Object {
    // object may move later
    unimplemented!()
}
If compaction happens later, the pointer may dangle.
How to avoid it
Return a stable handle or ID instead, and resolve it through the heap whenever access is needed.
Comparisons
| Approach | What it returns | Good for moving GC? | Main drawback |
|---|---|---|---|
| Raw pointer | Memory address | No | Becomes invalid when object moves |
| Object<'a> wrapping raw pointer | Address with lifetime marker | Usually no | Lifetime does not guarantee address stability |
| Reference &'a T | Borrowed access | Yes, temporarily | Cannot be stored long-term across heap mutation |
| Index/ID handle | Stable identity | Yes | Requires heap lookup for access |
| Handle with generation | Stable identity plus stale-checking | Yes | Slightly more bookkeeping |
Cheat Sheet
Core rule
Rust lifetimes track borrow duration, not moving-GC object validity.
Prefer
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Handle(usize);
impl Heap {
    fn allocate(&mut self, tag: Tag) -> Handle { unimplemented!() }
    fn get(&self, h: Handle) -> Option<&Object> { unimplemented!() }
    fn get_mut(&mut self, h: Handle) -> Option<&mut Object> { unimplemented!() }
}
Avoid
fn allocate(&mut self) -> *mut Object;
fn allocate<'a>(&'a mut self) -> Object<'a>;
FAQ
Can Rust lifetimes prevent a garbage-collected object from moving?
No. Lifetimes only restrict how long borrowed access is allowed. They do not guarantee that an object's memory address stays the same.
Is PhantomData enough to make a raw pointer safe?
No. PhantomData can express type relationships to the compiler, but it does not make dereferencing a raw pointer valid.
Why does fn allocate<'a>(&'a mut self) -> Object<'a> cause usability problems?
Because the returned value keeps the mutable borrow alive for 'a, which often prevents more calls that also need &mut self.
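A small sketch of that usability problem follows. It assumes a simplified Object<'a> that stores an index rather than a raw pointer, and an allocate that takes no tag; both are stand-ins for the question's types.

```rust
use std::marker::PhantomData;

pub struct Heap {
    count: usize,
}

/// Stand-in for the question's `Object<'a>`: an index tagged with a lifetime.
pub struct Object<'a> {
    index: usize,
    _heap: PhantomData<&'a mut Heap>,
}

impl Heap {
    pub fn new() -> Self {
        Self { count: 0 }
    }

    /// The returned `Object<'a>` keeps the `&'a mut self` borrow alive.
    pub fn allocate<'a>(&'a mut self) -> Object<'a> {
        let index = self.count;
        self.count += 1;
        Object { index, _heap: PhantomData }
    }
}

impl Object<'_> {
    pub fn index(&self) -> usize {
        self.index
    }
}

fn main() {
    let mut heap = Heap::new();
    let first = heap.allocate();
    // let second = heap.allocate(); // error[E0499] if uncommented:
    //                               // `heap` is still mutably borrowed by `first`
    assert_eq!(first.index(), 0);

    // Once `first` is no longer used, the borrow ends and allocation works again.
    let second = heap.allocate();
    assert_eq!(second.index(), 1);
}
```

In other words, only one such object can be live at a time, which is rarely what a heap API wants.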
What should I return instead of a raw pointer in a moving GC?
A stable handle, index, or object ID is usually the safest choice.
How do I safely access the object behind a handle?
Provide methods on the heap such as get, get_mut, or GC-session methods that validate the handle and borrow the heap.
Should reallocate return pointers or handles?
Handles are usually better. They represent object identity without exposing unstable addresses.
Is it okay to use raw pointers internally?
Yes, if necessary. The important part is to keep unsafe internal and expose a safe API with clear invariants.
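To make the reallocate answer above concrete, here is a hedged sketch of a merge that consumes the old heap by value and returns a map from old handles to new ones. It is a simplification: it moves every object rather than only live ones, uses strings as payloads, and redefines Adjust as a handle-to-handle map, none of which is taken from the original implementation.

```rust
use std::collections::BTreeMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct Handle(usize);

pub struct Heap {
    objects: Vec<String>,
}

/// Maps handles from the consumed heap to handles in the receiving heap.
pub type Adjust = BTreeMap<Handle, Handle>;

impl Heap {
    pub fn new() -> Self {
        Self { objects: Vec::new() }
    }

    pub fn allocate(&mut self, value: &str) -> Handle {
        self.objects.push(value.to_string());
        Handle(self.objects.len() - 1)
    }

    pub fn get(&self, h: Handle) -> Option<&str> {
        self.objects.get(h.0).map(String::as_str)
    }

    /// Move every object from `other` into `self`.
    /// Taking `other` by value means the old heap cannot be used afterwards.
    pub fn reallocate(&mut self, other: Heap) -> Adjust {
        let mut adjust = Adjust::new();
        for (old_index, value) in other.objects.into_iter().enumerate() {
            let new = Handle(self.objects.len());
            self.objects.push(value);
            adjust.insert(Handle(old_index), new);
        }
        adjust
    }
}

fn main() {
    let mut main_heap = Heap::new();
    main_heap.allocate("a");

    let mut nursery = Heap::new();
    let old = nursery.allocate("b");

    let adjust = main_heap.reallocate(nursery);
    let new = adjust[&old];
    assert_eq!(main_heap.get(new), Some("b"));
    // nursery.get(old); // error[E0382] if uncommented: `nursery` was moved
}
```

One caveat of this sketch: a bare index does not record which heap issued it, so the old handle could still be passed to main_heap by mistake and resolve to an unrelated object. A heap identifier or generation stored in the handle would be needed to catch that.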
Mini Project
Description
Build a small Rust heap that stores strings and returns stable object handles instead of raw pointers. This demonstrates the main API design lesson behind moving garbage collectors: callers keep identities, while the heap controls access to the current object location.
Goal
Create a heap API that allows allocation and lookup through safe handles, without exposing direct memory addresses.
Requirements
- Define a handle type that uniquely identifies an object.
- Implement allocate to store a new object and return its handle.
- Implement get to read an object through its handle.
- Implement a compact-style method that rebuilds internal storage without changing how callers use handles.
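One way to meet these requirements is an indirection table that maps each handle to the object's current position. The sketch below is only one possible solution: the release and storage_len helpers are hypothetical additions so that compaction has something to do and its effect can be observed.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Handle(usize);

pub struct Heap {
    /// Handle -> current position in `storage`, or `None` once released.
    table: Vec<Option<usize>>,
    /// Object storage; entries are `None` after release until compaction.
    storage: Vec<Option<String>>,
}

impl Heap {
    pub fn new() -> Self {
        Self { table: Vec::new(), storage: Vec::new() }
    }

    pub fn allocate(&mut self, value: &str) -> Handle {
        self.storage.push(Some(value.to_string()));
        self.table.push(Some(self.storage.len() - 1));
        Handle(self.table.len() - 1)
    }

    pub fn get(&self, h: Handle) -> Option<&str> {
        let position = (*self.table.get(h.0)?)?;
        self.storage.get(position)?.as_deref()
    }

    /// Hypothetical helper (not in the requirements): mark an object as dead.
    pub fn release(&mut self, h: Handle) {
        if let Some(entry) = self.table.get_mut(h.0) {
            if let Some(position) = entry.take() {
                self.storage[position] = None;
            }
        }
    }

    /// Rebuild storage without holes and repoint the table. Handles are unchanged.
    pub fn compact(&mut self) {
        let old = std::mem::take(&mut self.storage);
        let mut new_position: Vec<Option<usize>> = vec![None; old.len()];
        for (i, slot) in old.into_iter().enumerate() {
            if slot.is_some() {
                new_position[i] = Some(self.storage.len());
                self.storage.push(slot);
            }
        }
        for entry in self.table.iter_mut() {
            *entry = (*entry).and_then(|p| new_position[p]);
        }
    }

    /// Number of storage cells currently in use (for illustration).
    pub fn storage_len(&self) -> usize {
        self.storage.len()
    }
}

fn main() {
    let mut heap = Heap::new();
    let a = heap.allocate("a");
    let b = heap.allocate("b");
    let c = heap.allocate("c");

    heap.release(b);
    heap.compact();

    assert_eq!(heap.storage_len(), 2);
    assert_eq!(heap.get(a), Some("a"));
    assert_eq!(heap.get(b), None);
    assert_eq!(heap.get(c), Some("c")); // same handle, new position
}
```

In this sketch the table itself never shrinks and handles are never reused, which keeps old handles from aliasing new objects at the cost of some memory. A generation counter would allow table entries to be recycled safely.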