HardBound

References:

[1] HardBound: Architectural Support for Spatial Safety of the C Programming Language. [paper]

The C programming language is at least as well known for its absence of spatial memory safety guarantees (i.e., lack of bounds checking) as it is for its high performance. C’s unchecked pointer arithmetic and array indexing allow simple programming mistakes to lead to erroneous executions, silent data corruption, and security vulnerabilities.

Many prior proposals have tackled enforcing spatial safety in C programs by checking pointer and array accesses. However, existing software-only proposals have significant drawbacks that may prevent wide adoption, including: unacceptably high runtime overheads, lack of completeness, incompatible pointer representations, or need for non-trivial changes to existing C source code and compiler infrastructure.

Overview

HardBound is a hardware bounded pointer architectural primitive that supports cooperative hardware/software enforcement of spatial memory safety for C programs.

It introduces a new hardware primitive datatype for pointers that leaves the standard C pointer representation intact, but augments it with bounds information maintained separately and invisibly by the hardware.

  • Software initializes the bounds.
  • The hardware then propagates and enforces them transparently.
  • The hardware automatically checks a pointer’s bounds before it is dereferenced.

One mode of use requires instrumenting only malloc, which enables enforcement of per-allocation spatial safety for heap-allocated objects for existing binaries.
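As a rough illustration of the malloc-only mode, the sketch below models in plain C what an instrumented allocator would record for each allocation. Real HardBound keeps the base and bound in hidden hardware state; the `hb_ptr` struct and the `hb_malloc` name are hypothetical stand-ins for that state, not an interface from the paper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical software stand-in for a hardware bounded pointer. */
typedef struct {
    void     *value;  /* the ordinary C pointer, representation unchanged */
    uintptr_t base;   /* first valid byte address */
    uintptr_t bound;  /* one past the last valid byte address */
} hb_ptr;

/* Instrumented allocator: allocate, then record per-allocation bounds.
   On real hardware the bounds would be set by a setbound instruction. */
hb_ptr hb_malloc(size_t size) {
    hb_ptr p = { malloc(size), 0, 0 };
    if (p.value != NULL) {
        p.base  = (uintptr_t)p.value;
        p.bound = p.base + size;
    }
    return p;
}
```

Because only the allocator changes, code that later uses the pointer needs no recompilation in this mode, which is what makes it applicable to existing binaries.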

When combined with simple intra-procedural compiler instrumentation, hardware bounded pointers enable a low-overhead approach for enforcing complete spatial memory safety in unmodified C programs.

Design

  • HardBound extends every register and every word of memory in the virtual address space with a “sidecar” shadow base and bound.

    • Each value is therefore a triple: <value, base, bound>.
    • Non-pointers: base and bound are both zero.
    • Pointers: base and bound are checked on every load and store through the pointer.
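  • setbound instruction: lets software set the base and bound of a pointer held in a register.
  • readbound instruction: reads the bound associated with a register.
  • readbase instruction: reads the base associated with a register.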
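The register triple and the three instructions can be modeled in software roughly as follows. This is a minimal sketch, assuming that setbound takes a pointer value and a size in bytes; the `hb_reg` type and the `hb_*` function names are illustrative and do not come from the paper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of one register: the visible value plus its sidecar metadata. */
typedef struct {
    uintptr_t value;
    uintptr_t base;
    uintptr_t bound;
} hb_reg;

/* A non-pointer value carries zero base and bound. */
hb_reg hb_nonpointer(uintptr_t v) {
    hb_reg r = { v, 0, 0 };
    return r;
}

/* Assumed setbound semantics: treat r.value as a pointer to `size`
   bytes, so base = value and bound = value + size. */
hb_reg hb_setbound(hb_reg r, size_t size) {
    r.base  = r.value;
    r.bound = r.value + size;
    return r;
}

/* readbase / readbound expose the sidecar fields as ordinary values. */
uintptr_t hb_readbase(hb_reg r)  { return r.base; }
uintptr_t hb_readbound(hb_reg r) { return r.bound; }
```

For example, `hb_setbound(hb_nonpointer(0x1000), 64)` yields a register whose base is 0x1000 and whose bound is 0x1040, while its value is unchanged.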

Bound set and propagation

Bound set and propagation in registers:

Figure 2: Code demonstrating implicit bounds checks and bounds propagation.

  • A failed check raises a processor exception, which the runtime handles:
    • by terminating the process, or
    • by invoking some other language-specific exception.

Pointer arithmetic and other pointer manipulations are common in C programs. To free the compiler from the burden of explicitly maintaining and propagating bounds information (and eliminate the associated run-time overhead), the hardware automatically propagates the bounds information when a register containing a pointer is manipulated.
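The propagation and checking rules can be sketched in C as a software model. This is a sketch under stated assumptions rather than the hardware's exact logic: the `hb_reg` type and the `hb_add` and `hb_check` names are hypothetical, and the check returns a flag where the hardware would raise an exception.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Model of a register with sidecar bounds (illustrative names). */
typedef struct {
    uintptr_t value;
    uintptr_t base;
    uintptr_t bound;
} hb_reg;

/* add: only the value changes; base and bound are copied from the
   pointer operand, so derived pointers keep the original bounds. */
hb_reg hb_add(hb_reg ptr, intptr_t offset) {
    hb_reg r = ptr;
    r.value = ptr.value + (uintptr_t)offset;
    return r;
}

/* Check done implicitly by a load or store of `width` bytes.
   Returns true when [value, value + width) lies inside [base, bound).
   In this model a non-pointer (base = bound = 0) fails for any
   nonzero width. */
bool hb_check(hb_reg ptr, size_t width) {
    return ptr.value >= ptr.base
        && ptr.value <= ptr.bound
        && width <= ptr.bound - ptr.value;
}
```

With a 16-byte object at 0x1000, a 4-byte access at offset 12 passes, while accesses at offset 16 or offset -4 fail. No check happens at the `add` itself, so a pointer may leave its bounds temporarily as long as it is not dereferenced there.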

Figure 3: Bounds propagation with `add`, and bounds checks with `load`/`store` instructions.

Bound propagation to/from Memory:

Every value in memory also conceptually has a base and bound word associated with it (Mem[addr].value, Mem[addr].base, Mem[addr].bound).

  • Metadata is placed in the virtual memory space, paralleling the normal data space but offset by a constant amount.

    • base(addr) = SHADOW_SPACE_BASE + (addr * 2)
    • bound(addr) = SHADOW_SPACE_BASE + (addr * 2) + 1
  • Pointer and non-pointer encoding

    • Tag metadata space: 1 bit per word indicates whether the word is a pointer.
    • If the word is a pointer, its base/bound are loaded.
    • If the word is not a pointer, there is no need to load base/bound.
    • Tag metadata cache: accessed in parallel with the L1 cache.
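The shadow-space mapping and the per-word tag bit can be sketched in C as below. Several things here are assumptions made for illustration: `addr` is treated as a word index, the value of `SHADOW_SPACE_BASE` is arbitrary, and the tag space is modeled as a simple byte array with hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Arbitrary placeholder; the real constant is chosen by the system. */
#define SHADOW_SPACE_BASE ((uint64_t)1 << 40)

/* Each data word has two shadow words: base, then bound. */
uint64_t shadow_base_addr(uint64_t addr) {
    return SHADOW_SPACE_BASE + addr * 2;
}

uint64_t shadow_bound_addr(uint64_t addr) {
    return SHADOW_SPACE_BASE + addr * 2 + 1;
}

/* Tag space model: one bit per data word, packed eight per byte. */
bool tag_is_pointer(const uint8_t *tags, uint64_t addr) {
    return (tags[addr / 8] >> (addr % 8)) & 1u;
}

void tag_set(uint8_t *tags, uint64_t addr, bool is_pointer) {
    uint8_t mask = (uint8_t)(1u << (addr % 8));
    if (is_pointer)
        tags[addr / 8] |= mask;
    else
        tags[addr / 8] &= (uint8_t)~mask;
}
```

A load would consult `tag_is_pointer` first and touch the two shadow words only when the bit is set, which is why non-pointer data pays little extra memory traffic.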

hardbound tag metadata cache

Compressing Bounded Pointers

Many pointers in C programs point to structs or small arrays. CCured’s success in inferring SAFE pointers indicates that the value and base components of a pointer are often identical. Furthermore, most C structs are small, so the difference between the pointer base and bound is also small. These observations suggest a simple mechanism for compressing the metadata: use just a few bits to encode the common case of pointers to small objects, but retain the full base/bound encoding option as a backup.
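To make the idea concrete, here is one possible small-object encoding sketched in C. It is a hypothetical scheme invented for illustration, not the encoding used by HardBound: a 4-bit code stores the object size in 4-byte units when the pointer value equals its base, and code 0 means "not compressible, use the full base/bound".

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uintptr_t value;
    uintptr_t base;
    uintptr_t bound;
} hb_ptr;

#define HB_UNIT 4u  /* assumed size granularity in bytes */

/* Returns a code in 1..15 when the pointer is compressible under this
   hypothetical scheme, or 0 when the full base/bound must be kept. */
unsigned hb_compress(hb_ptr p) {
    if (p.base != p.value || p.bound <= p.base)
        return 0;
    uintptr_t size = p.bound - p.base;
    if (size % HB_UNIT != 0 || size / HB_UNIT > 15)
        return 0;
    return (unsigned)(size / HB_UNIT);
}

/* Rebuilds the full triple from the pointer value and a nonzero code. */
hb_ptr hb_decompress(uintptr_t value, unsigned code) {
    hb_ptr p = { value, value, value + (uintptr_t)code * HB_UNIT };
    return p;
}
```

Under this scheme a pointer to the start of a 24-byte struct compresses to code 6, while an interior pointer or a pointer to a 4 KB buffer falls back to the full encoding.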

todo.

Evaluation

Binary compatibility: Yes.

Source code changes: Almost none, apart from instrumenting malloc.

Hardbound:

Created Jul 5, 2019 // Last Updated Jul 17, 2021
