Thoughts

Syntax, Semantics, and Segfaults: From Theory to Practice - Rust's Type System in C++

J

27 Feb 2025 — 7 min read

Introduction

In our previous analyses, "Syntax, Semantics, and Segfaults: A Cross-Disciplinary Analysis" and "Syntax, Semantics, and Segfaults: A Formal Perspective," we explored the theoretical foundations that govern programming languages, from the philosophical underpinnings of meaning to the rigorous mathematical frameworks that ensure program correctness. We examined how type systems can prevent runtime errors through formal guarantees and how languages like Rust implement ownership and borrowing to ensure memory safety without garbage collection.

This article bridges theory and practice by implementing Rust's ownership and borrowing system in modern C++. By constructing a simulation of Rust's type system, we gain deeper insights into both the theoretical foundations and practical challenges of type-safe programming. Our implementation serves as a concrete demonstration of abstract concepts like "progress" and "preservation," showing how formal guarantees manifest in actual code.

As Benjamin Pierce notes in Types and Programming Languages, "A type system is a tractable syntactic method for proving the absence of certain program behaviors by classifying phrases according to the kinds of values they compute." Our C++ implementation makes this definition tangible—showing how a carefully designed system of classes and constraints can prove the absence of memory errors at compile time.

The Formal Foundations of Ownership and Borrowing

Ownership as a Formal System

Rust's ownership system can be formalized as a linear type system where resources (values) have exactly one owner at any given time. In formal terms, we can represent this with a judgment:

$$\Gamma \vdash e : \tau \; \mathbf{owned}$$

This judgment states that in context $\Gamma$, expression $e$ has type $\tau$ and is uniquely owned. When ownership transfers (through moves), the previous binding becomes invalid—a property we can express through a linear typing context where variables cannot be used more than once.

In our C++ implementation, we encode this with the Owner<T> class, which explicitly disables copying and only allows moving:

// Disable copy constructor (like Rust)
Owner(const Owner&) = delete;
Owner& operator=(const Owner&) = delete;

// Move constructor (Rust's ownership transfer)
Owner(Owner&& other) noexcept : 
  data(std::move(other.data)),
  immut_borrow_count(std::exchange(other.immut_borrow_count, 0)),
  mut_borrowed(std::exchange(other.mut_borrowed, false)) {}

This directly encodes the linear typing judgment: after moving, the source variable becomes invalid, preventing double-use of resources.

Borrowing Rules as Type Judgments

Rust's borrowing system can be expressed through formal type judgments that constrain reference creation. For immutable references, we can write:

$$\frac{\Gamma \vdash e : \tau \; \text{owned}}{\Gamma \vdash \text{ref}(e) : \text{ref}(\tau)}$$

This states that if $e$ is an owned value of type $\tau$, then $\text{ref}(e)$ is a reference to $\tau$. For mutable borrowing, we need an additional constraint that no other borrows exist:

$$\frac{\Gamma \vdash e : \tau \; \text{owned} \quad \text{noborrow}(e)}{\Gamma \vdash \text{ref}_{\text{mut}}(e) : \text{ref}_{\text{mut}}(\tau)}$$

Where $\mathbf{noborrow}(e)$ is a predicate indicating that $e$ currently has no active borrows.

Our C++ implementation encodes these rules through counting and flags:

// Create immutable reference from Owner
explicit constexpr Ref(const Owner<T>& o) : owner(&o) {
  // Cannot create immutable reference if mutable reference exists
  assert(!owner->mut_borrowed && "Cannot borrow immutably while mutably borrowed");
  ++owner->immut_borrow_count;
}

// Create mutable reference from Owner
explicit constexpr MutRef(Owner<T>& o) : owner(&o) {
  // Cannot create mutable reference if other references exist
  assert(!owner->mut_borrowed && owner->immut_borrow_count == 0 && 
         "Cannot borrow mutably when other borrows exist");
  owner->mut_borrowed = true;
}

These constructors implement the formal type judgments, enforcing Rust's borrowing rules at runtime through assertions (in a real compiler, these would be compile-time checks).

Type Systems and Memory Safety: From Theory to Code

Modeling Progress and Preservation

As discussed in our formal analysis, type safety consists of two key properties:

Progress: Well-typed terms don't get stuck (except at values)
Preservation: Reduction preserves types

Our C++ implementation demonstrates these properties:

Progress is ensured because operations on Owner<T>, Ref<T>, and MutRef<T> are always defined for valid states. Invalid operations (like dereferencing a null owner) are explicitly checked and result in controlled failures (assertions).
Preservation is maintained because our references track their lifetimes and prevent invalid access. When a reference goes out of scope, its destructor decrements the borrow count, ensuring type consistency throughout execution.

Small-Step Operational Semantics in Action

In formal operational semantics, we define transitions between program states. For example, the rule for borrowing might be:

$$\frac{H, e \Rightarrow^* H', v \quad v \mapsto a \in H'}{H, \text{borrow}(e) \Rightarrow H', \text{ref}(a)}$$

This states that to borrow expression $e$, we first evaluate it to a value $v$ pointing to address $a$, then create a reference to $a$.

Our C++ implementation makes these abstract steps concrete:

// Create immutable reference (Rust's &T)
[[nodiscard]] constexpr Ref<T> borrow() const {
  return Ref<T>(*this);
}

When borrow() is called, our implementation performs exactly the steps described in the operational semantics rule: it creates a reference to the owner, increments the borrow count, and returns the reference.

Ownership and Borrowing in Practice

Let's examine how our implementation enforces ownership and borrowing rules through an annotated code walkthrough:

auto x = Owner<int>(5);  // Create an owner of an integer
std::cout << "x = " << *x << std::endl;  // Dereference is safe

{
  // Create immutable reference - increments borrow count
  auto r1 = x.borrow();

  // Can create another immutable reference - multiple & references allowed
  auto r2 = x.borrow();

  // Attempting mutable borrow would fail:
  // auto rm = x.borrow_mut(); // Runtime assertion would fail
} // r1 and r2 destructors decrement borrow count

{
  // Now can create mutable reference - no active borrows
  auto rm = x.borrow_mut();
  *rm = 10;  // Can modify through mutable reference

  // Attempting immutable borrow would fail:
  // auto r3 = x.borrow(); // Runtime assertion would fail
}

// Ownership transfer
auto y = std::move(x);
// x no longer has a value
assert(!x.has_value());  // This assertion passes

This example demonstrates fundamental Rust concepts:

Ownership of values
Creating and destroying borrows
The relationship between mutable and immutable borrows
Transferring ownership through moves

Type Specialization for Memory Safety

Our implementation also demonstrates how type-level distinctions can improve memory safety. We use C++'s concepts to specialize behavior for copyable vs. non-copyable types:

// Concepts definition
template<typename T>
concept Copyable = std::is_trivially_copyable_v<T>;

template<typename T>
concept NonCopyable = !Copyable<T>;

This allows us to handle primitive types (like int) more efficiently by storing them directly on the stack, while complex types (like std::vector) use proper heap allocation. Both implementations still enforce the same ownership and borrowing rules, showing how type-level distinctions can improve performance without sacrificing safety.

Leveraging C++23 Features for Formal Reasoning

Modern C++23 features enhance our ability to implement and reason about formal systems:

Expected and Result Types for Error Handling

The std::expected<T, E> type in C++23 closely parallels Rust's Result<T, E> type, allowing us to model error cases explicitly:

// Result type for error handling (similar to Rust's Result<T, E>)
template<typename T, typename E = std::string_view>
using Result = std::expected<T, E>;

// Try to borrow with explicit error handling
Result<Ref<T>> try_borrow() const {
  if (mut_borrowed) {
    return std::unexpected("Cannot borrow immutably while mutably borrowed");
  }
  return Ref<T>(*this);
}

This aligns with formal semantics by making error states explicit in the type system, rather than relying on exceptions or undefined behavior. It corresponds to a judgment form like:

$$\Gamma \vdash e : \text{Result}<\tau, \epsilon>$$

Which states that expression $e$ either produces a value of type $\tau$ or an error of type $\epsilon$.

Concepts as Formal Type Classes

C++20/23 concepts provide a way to express type constraints similar to formal type classes:

template<typename T>
concept Copyable = std::is_trivially_copyable_v<T>;

This is analogous to a type judgment of the form:

$$\Gamma \vdash T \; \textbf{satisfies} \; \text{Copyable}$$

Meaning type $T$ satisfies the constraints of the Copyable concept. These constraints are checked at compile-time, providing static guarantees about type properties.

Connection to Formal Verification

Our implementation bridges the gap between theoretical verification and practical code. In formal verification, we might prove a theorem like:

Theorem (No Use After Move): If variable $x$ is moved, then any subsequent use of $x$ will result in a type error.

In our implementation, this property is enforced through the combination of:

Deleting copy constructors
Tracking ownership through move semantics
Checking validity before operations

constexpr T& operator*() {
  assert(has_data && "Dereferencing an empty Owner");
  return data;
}

Similarly, we enforce the exclusivity of mutable references:

Theorem (Mutable Borrow Exclusivity): If a mutable reference to variable $x$ exists, no other references to $x$ can exist simultaneously.

Our runtime checks ensure this property:

// Cannot create mutable reference if other references exist
assert(!owner->mut_borrowed && owner->immut_borrow_count == 0 && 
       "Cannot borrow mutably when other borrows exist");

Limitations and Bridge to Future Work

While our implementation demonstrates key concepts, it has limitations compared to Rust's actual implementation:

Compile-time vs. Runtime Checking: Rust performs borrow checking at compile-time, while our implementation uses runtime assertions.
Lifetime Elision: Rust's compiler implements sophisticated lifetime elision rules, which we cannot capture in C++.
Interior Mutability: We don't model Rust's RefCell or other interior mutability patterns.

Future work could address these limitations by:

Using more sophisticated template metaprogramming to move checks to compile-time
Exploring static analysis tools to verify our implementation
Extending the model with additional Rust concepts like lifetime bounds

Conclusion: The Value of Implementation as Understanding

By implementing Rust's ownership and borrowing system in C++, we've created a tangible manifestation of the formal concepts discussed in our previous analyses. This implementation helps bridge the gap between abstract theory and practical programming, showing how formal judgments and rules translate into executable code.

The process of implementation forces us to confront the practical challenges of realizing formal systems. It reveals edge cases, performance considerations, and usability concerns that might be overlooked in pure theory. Simultaneously, understanding the formal foundations helps us create more robust implementations by identifying invariants that must be maintained and properties that must be guaranteed.

As demonstrated in our code, modern C++ provides powerful tools for expressing formal concepts, from template metaprogramming to concepts to monadic error handling. These features allow us to create increasingly sophisticated type systems that can enforce properties previously only expressible in formal calculi.

In the ongoing dialogue between theory and practice, implementations like ours serve as crucial translational artifacts. They help practitioners understand theoretical concepts through familiar code, while giving theoreticians concrete models to refine their abstractions. By continuing this conversation across disciplines, we advance both the theory and practice of programming languages, gradually narrowing the gap between what we can prove on paper and what we can enforce in code.

The journey from syntax to semantics to safety is not just a theoretical exercise—it's a practical path toward software that is correct by construction. As our implementation shows, the formal guarantees of type systems can indeed be realized in practice, bringing us closer to the ideal of programs that simply cannot go wrong.