References:
[1] HardBound: Architectural Support for Spatial Safety of the C Programming Language. [paper]
The C programming language is at least as well known for its absence of spatial memory safety guarantees (i.e., lack of bounds checking) as it is for its high performance. C’s unchecked pointer arithmetic and array indexing allow simple programming mistakes to lead to erroneous executions, silent data corruption, and security vulnerabilities.
Many prior proposals have tackled enforcing spatial safety in C programs by checking pointer and array accesses. However, existing software-only proposals have significant drawbacks that may prevent wide adoption, including: unacceptably high runtime overheads, lack of completeness, incompatible pointer representations, or need for non-trivial changes to existing C source code and compiler infrastructure.
Hardbound: A hardware bounded pointer architectural primitive that supports cooperative hardware/software enforcement of spatial memory safety for C programs.
A new hardware primitive datatype for pointers that leaves the standard C pointer representation intact, but augments it with bounds information maintained separately and invisibly by the hardware.
One mode of use requires instrumenting only malloc, which enables enforcement of per-allocation spatial safety for heap-allocated objects for existing binaries.
When combined with simple intra-procedural compiler instrumentation, hardware bounded pointers enable a low-overhead approach for enforcing complete spatial memory safety in unmodified C programs.
Hardbound extends every register and word of memory in the virtual address space with a “sidecar” shadow base and bound.
setbound instruction:
readbound instruction:
readbase instruction:
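The notes do not record the exact semantics of these three instructions, so the following is only a software model of how they might behave. It assumes setbound takes a pointer and a size and sets base = pointer and bound = pointer + size (exclusive), and that a base and bound of zero mark a non-pointer. The names hb_reg, hb_setbound, hb_readbase, hb_readbound and hb_malloc are made up for this sketch; in the real design the metadata lives in hardware and is not a C struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Software model of one Hardbound register: the ordinary value plus the
 * "sidecar" base and bound. Hypothetical type, for illustration only. */
typedef struct {
    uintptr_t value;  /* the unchanged C pointer (or plain integer) */
    uintptr_t base;   /* lowest valid address; 0 for a non-pointer */
    uintptr_t bound;  /* one past the highest valid address (assumed exclusive) */
} hb_reg;

/* setbound (assumed semantics): the result points at `ptr` and may
 * legally touch the `size` bytes starting there. */
static hb_reg hb_setbound(uintptr_t ptr, size_t size) {
    hb_reg r = { ptr, ptr, ptr + size };
    return r;
}

/* readbase / readbound: copy the hidden metadata into an ordinary value. */
static uintptr_t hb_readbase(hb_reg r)  { return r.base; }
static uintptr_t hb_readbound(hb_reg r) { return r.bound; }

/* Sketch of the "instrument only malloc" mode: the allocator is the one
 * place that knows the object size, so it issues the setbound. */
static hb_reg hb_malloc(size_t size) {
    void *p = malloc(size);
    assert(p != NULL);  /* keep the sketch simple: no out-of-memory path */
    return hb_setbound((uintptr_t)p, size);
}
```

With this model, hb_malloc(16) yields a register whose base equals its value and whose bound lies 16 bytes past the base, which is the per-allocation information the hardware would need for heap objects.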
Bound set and propagation in registers:
Pointer arithmetic and other pointer manipulations are common in C programs. To free the compiler from the burden of explicitly maintaining and propagating bounds information (and eliminate the associated run-time overhead), the hardware automatically propagates the bounds information when a register containing a pointer is manipulated.
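A small software model may help make the propagation rule concrete. It assumes that pointer arithmetic changes only the value and copies base and bound unchanged, and that the check happens when the register is dereferenced. hb_reg, hb_add and hb_access_ok are hypothetical names; the hardware does this implicitly, with no such calls in the program.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a register with sidecar metadata. */
typedef struct {
    uintptr_t value;
    uintptr_t base;
    uintptr_t bound;  /* assumed exclusive */
} hb_reg;

/* Model of "add rd <- rs, imm": the value moves, the metadata is
 * propagated untouched. No check is made here, so a temporarily
 * out-of-bounds pointer is allowed as long as it is never dereferenced. */
static hb_reg hb_add(hb_reg rs, intptr_t imm) {
    hb_reg rd = rs;
    rd.value = rs.value + (uintptr_t)imm;
    return rd;
}

/* Model of the implicit check on a load or store of `width` bytes.
 * Written to avoid overflow in value + width. In this model a
 * non-pointer (base == bound == 0) never passes. */
static bool hb_access_ok(hb_reg r, size_t width) {
    assert(width > 0);
    return r.value >= r.base
        && r.value <= r.bound
        && width <= r.bound - r.value;
}
```

For a 16-byte object at 0x1000, hb_add(p, 12) followed by a 4-byte access passes the check, while hb_add(p, 16) or hb_add(p, -1) followed by any access fails it.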
Bound propagation to/from Memory:
Every value in memory also conceptually has a base and bound word associated with it (Mem[addr].value, Mem[addr].base, Mem[addr].bound).
Metadata is placed in the virtual memory space, paralleling the normal data space but offset by a constant amount.
base(addr) = SHADOW_SPACE_BASE + (addr * 2)
bound(addr) = SHADOW_SPACE_BASE + (addr * 2) + 1
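The two formulas translate directly into address arithmetic. The sketch below assumes addr is a word-granularity address (so consecutive data words map to consecutive base/bound pairs) and picks an arbitrary value for SHADOW_SPACE_BASE; neither choice comes from the notes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical start of the shadow region; the real constant is a
 * platform decision. */
#define SHADOW_SPACE_BASE UINT64_C(0x100000000)

/* Location of the base word for data word `addr`. */
static uint64_t shadow_base_addr(uint64_t addr) {
    assert(addr < SHADOW_SPACE_BASE);  /* data is assumed to sit below the shadow region */
    return SHADOW_SPACE_BASE + addr * 2;
}

/* Location of the bound word: the slot right after the base word. */
static uint64_t shadow_bound_addr(uint64_t addr) {
    assert(addr < SHADOW_SPACE_BASE);
    return SHADOW_SPACE_BASE + addr * 2 + 1;
}
```

Because each data word owns two adjacent shadow words, the metadata for neighbouring data words never overlaps: shadow_bound_addr(a) is always one less than shadow_base_addr(a + 1).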
Pointer and Non-pointer encoding
Many pointers in C programs point to structs or small arrays. CCured’s success in inferring SAFE pointers indicates that often the value and base component of a pointer are identical. Furthermore, most C structs are small, so the difference between the pointer base and bound is also small. These observations suggest a simple mechanism for compressing the metadata: use just a few bits to encode the common case of pointers to small objects, but retain the full base/bound encoding option as a backup.
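One way to picture the compression idea is a small tag per word. The layout below is invented for illustration and is not the paper's actual encoding: it assumes a 4-bit tag where 0 means non-pointer, 1 to 14 mean "base equals value and the object is that many 4-byte words long", and 15 means "fall back to the full base/bound in shadow space".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 4-bit tag values. */
#define HB_TAG_NONPTR      0u   /* not a pointer: no metadata needed */
#define HB_TAG_FULL        15u  /* uncompressed base/bound kept elsewhere */
#define HB_MAX_SMALL_WORDS 14u  /* largest object the tag can describe */

/* Pick a tag for a (value, base, bound) triple with an exclusive bound. */
static unsigned hb_encode_tag(uintptr_t value, uintptr_t base, uintptr_t bound) {
    if (base == 0 && bound == 0)
        return HB_TAG_NONPTR;
    assert(base <= bound);
    uintptr_t size = bound - base;
    /* Common case: pointer to the start of a small, word-multiple object. */
    if (value == base && size != 0 && size % 4 == 0
        && size / 4 <= HB_MAX_SMALL_WORDS)
        return (unsigned)(size / 4);
    return HB_TAG_FULL;
}

/* Rebuild base/bound from the tag and the pointer value. Returns false
 * when the tag says the full encoding must be read instead. */
static bool hb_decode_tag(unsigned tag, uintptr_t value,
                          uintptr_t *base, uintptr_t *bound) {
    if (tag == HB_TAG_FULL)
        return false;
    if (tag == HB_TAG_NONPTR) {
        *base = 0;
        *bound = 0;
        return true;
    }
    *base = value;
    *bound = value + (uintptr_t)tag * 4;
    return true;
}
```

A pointer to the start of a 16-byte struct compresses to tag 4 and decodes back to the same base and bound, while an interior pointer (value different from base) or a large array takes the fallback path.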
todo.
Binary compatibility: Yes.
Source code changes: Almost none, except for malloc.
Hardbound:
If you could revise the fundamental principles of computer system design to improve security... what would you change?