Control-C (CASES 2002) is a subset of C with key restrictions designed to ensure that the memory safety of code can be verified entirely by static checking, under certain system assumptions.
T1. Requires strong typing of all functions, variables, assignments, and expressions, using the same types as in C.
T2. Disallows casts to or from any pointer type. Casts between other types (e.g., intergers, floating point numbers, and characters) are allowed.
T3. A union can only contain types that can be cast to each other; consequently a union cannot contain a pointer type.
T4. No uses of uninitialized local pointer variables within a procedure. A pointer variable must be assigned a value before it is used or its address is taken.
For example:
int a, *p, **pp;
pp = &p;
print (**pp);
p = &a;
Here, p is uninitialized when its address is taken and assigned to pp. Dereferencing p via **pp is potentially unsafe, and this violation would be difficult to detect. The language therefore disallows taking the address of an uninitialized pointer.
T5. If a reserved address range is used, then no individual data object may be larger than the size of the reserved address range.
T6. Pointer arithmetic is disallowed.
String operations, for example, can be written using strlen followed by explicit array index operations; a safe string library can also be provided.
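The following sketch (hypothetical, with invented names and constants not taken from the source) contrasts constructs that these rules disallow with the index-based style Control-C expects:

#include <string.h>

union U { int i; float f; };   /* allowed under T3: int and float can be cast to each other */
/* union V { int i; int *p; };    disallowed under T3: a union may not contain a pointer    */

void to_upper(char s[]) {
    /* char *p = (char *) 0x1000;   disallowed under T2: cast to a pointer type */
    /* while (*s) { ...; s++; }     disallowed under T6: pointer arithmetic     */
    int n = strlen(s);             /* allowed: strlen plus explicit array indexing */
    for (int i = 0; i < n; ++i) {
        if (s[i] >= 'a' && s[i] <= 'z')
            s[i] = s[i] - ('a' - 'A');
    }
}

Note that iteration is driven entirely by the integer index i, so verifying safety reduces to reasoning about integer constraints (0 <= i < n) rather than about pointer values.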
One of the fundamental limits in static program analysis lies in the analysis of constraints on symbolic integer expressions. To ensure safety, the compiler must prove (symbolically) that the index expressions in an array reference lie within the corresponding array bounds on all possible execution paths. For each index expression, this can be formulated as an integer constraint system with equalities, inequalities, and logical operators used to represent the computation and control-flow statements of the program.
Unfortunately, integer constraints involving multiplication of symbolic variables are undecidable.
A broad, decidable class of symbolic integer constraints is Presburger arithmetic, which allows addition, subtraction, multiplication by constants, and the logical connectives $\vee, \wedge, \neg, \exists$, and $\forall$. The Omega library provides an efficient implementation that has been widely used for solving such integer programming problems in compiler research.
Exploiting static analysis based on Presburger arithmetic requires that the language allow only linear expressions with constant (known) coefficients in all computations that determine the set of values accessed by an array index expression.
Definition (affine transformation): A transformation $F: R^n \rightarrow R$ is affine if $F(\vec{p}) = C\vec{p} + q$, where $C$ is a linear transformation and $q$ contains only constants or known symbolic variables independent of $\vec{p}$.
In the following, we assume affine transformations with known constant integer coefficients ($C$).
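As a small, hedged illustration (the loop bound and array size below are invented for this sketch, not taken from the example later in this section): for an access $A[2i+1]$ executed inside a loop with $0 \le i < n$, where $A$ has $2n+2$ elements, the bounds check can be written as the Presburger formula
$\forall i.\ (0 \le i < n) \Rightarrow (0 \le 2i+1 < 2n+2),$
which is decidable because the index expression is affine in $i$ and $n$ (symbolic variables are multiplied only by constants). An access such as $A[i \cdot n]$, by contrast, multiplies two symbolic variables and falls outside this decidable class.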
Array operations in Control-C must obey the following rules:
A1. On all control flow paths, the value of every array index expression must lie within the bounds of the corresponding array.
A3. If an array, A, is accessed inside a loop, then:
A4. If an array is accessed outside of a loop then
The compiler can statically check that a program satisfies A1 only if the additional language rules A2 – A4 are obeyed.
Example:
... /*** caller function ***/
if ((k > 0) && (k < 5)) {
    B = (int *) RMalloc(k * 30);
    C = initialize(B, 20);
}

int *initialize(int *B, int n) {
    int *A, m;
    if (B[n] > 0) {
        A = (int *) RMalloc(5 * B[n] + 10);
        for (m = 0; m < B[n]; ++m)
            A[m*4] = m;
        return A;
    } else
        return NULL;
}
The code above is valid Control-C because:
Once the above three conditions are satisfied, the compiler proves A1 for all array accesses.
Note that to prove the safety of $B[n]$, the compiler must correctly include the constraint on $k$ ($k < 5$); only then can it verify that $n (= 20)$ is less than the size of $B$.
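For instance, letting $s$ stand for the symbolic value $B[n]$ and assuming, purely for illustration, that RMalloc sizes are measured in array elements (the units are not specified here), the safety of the access $A[m*4]$ inside the loop corresponds to the Presburger formula
$\forall m, s.\ (s > 0 \wedge 0 \le m < s) \Rightarrow (0 \le 4m < 5s + 10),$
which holds because $4m \le 4s - 4 < 5s + 10$ whenever $s > 0$.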
If you could revise the fundamental principles of computer system design to improve security ... what would you change?