CETS: Compiler Enforced Temporal Safety for C
Reference:
Temporal errors include:
- dangling pointer dereferences (referencing an object that has been deallocated),
- double
free
s (calling free()
on the same object multiple times),
- invalid frees (calling
free()
with a non-heap address or pointer to the middle of a heap-allocated region).
CETS: Compiler Enforced Temporal Safety.
Movivation
Temporal errors are challenging.
Prior proposals suffer from one or more of the following deficiencies:
- high runtime overheads,
- high memory overheads,
- failure to detect all temporal errors (for example, to the stack, to reallocated heap locations, or in the face of arbitrary casts)
- requiring annotations inserted by the programmer,
- or altering memory layout (which breaks compatibility with existing C code).
Example: Widely used Valgrind Memcheck tool:
- Fail to detect dangling pointers to reallocated data locations;
- Exhibits runtime overheads in excess of 10x.
Overview
CETS: Compiler Enforced Temporal Safety.
Combination of two existing techniques:
- An identifier-based scheme, which assigns a unique key for each allocation region to identify dangling pointers .
- This per-pointer metadata is tracked using a disjoint shadowspace, yielding high compatibility because memory layout is unchanged.
Optimization in prototype implementation:
- temporal-check elimination optimizations (analogous to, but different from, bounds-check elimination optimizations )
- reduces the number of checks by 70% on average.
Design and Implementation
CETS’s temporal safety properties can be guaranteed only if spatial safety is enforced too (otherwise metadata corruption could occure).
CETS lock and key implementation
CETS avoids the compatibility problems of fat pointers by borrowing the shadowspace mechanisms commonly used in location-based approaches to support per-pointer metadata without changing memory layout. See Figure 2 below:
Each pointer with two additional word-sized fields:
- (1) a unique allocation key
- (2) a lock address that points to a lock location, which is queried on temporal checks .
Each allocated memory object is associated with a unique key.
Usage of lock and key:
- Whenever memory region is allocated:
- 1. a unique key is associated with the region of the allocated memory;
- 2. a lock location is allocated;
- 3. the lock value is initialized to the key value.
- Invariant Property: the pointer is not a dangling pointer if and only if the pointer’s key and the value of the lock match.
- When a memory region is deallocated:
- 1. the lock location associated with the memory region is set to INVALID_KEY.
- The lock location can be reused but only as another lock value, and never for some other purpose.
- 2. the memory region is marked for possible reuse.
- Invariant Property: Dangling pointers holding the old key will never be used for a successful dereference since the lock is either invalid or has a different lock value.
- A mapping from keys to freeable pointers
- defenses to temporal errors:
- 1. free on pointers that are not returned by malloc().
- 2. call free twice:
- There is at most one freeable pointer associated with that key (i.e., the one originally returned by
malloc()
). (LLM: only one in all cases??? how about pointer assigned to another one, or passed to another function then freed there???).
Metadata operations
- Initialization.
- Heap allocation.
- Pointer metadata propagation.
- Dangling pointer check.
- Heap deallocation.
- Allocation/deallocation of lock addresses.
- Call stack allocations and deallocations.
- A key and corresponding lock address is associated with each stack frame.
- This key and lock pair is given to any pointer derived from the stack pointer (and thus points to an object on the stack).
- Metadata propagation with function calls.
- When pointers are passed as arguments or returned from functions, the key and lock address must also travel with them.
- Uses procedure cloning to transform every function declaration and function call site to include additional arguments for key and lock address. (LLM: backward compatibility is sacrificed.)
todo
Evaluation
Overhead :
48% for temporal check;
116% for both spatial and temporal check.