Exploring C Semantics and Pointer Provenance


Q&A

  • What is pointer provenance?
    • Pointer Provenance: the “origin” of pointer values
    • Problem:
      • The distinction between integer values and pointer values: whether they can be cast to each other, and how to cast them properly in a standard-conforming way
      • What “provenance tracking” means, or requires, in terms of the standard
  • How to determine provenance of a given pointer?
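
To make the integer/pointer cast question concrete, here is a minimal sketch of the one round trip ISO C does describe: a `void *` converted to `uintptr_t` and back compares equal to the original. The function names are hypothetical, chosen for illustration, and `uintptr_t` is an optional type, so this assumes an implementation that provides it.

```c
#include <stdint.h>

/* Hypothetical helper: pointer -> integer -> pointer.
 * Where uintptr_t exists, ISO C says the result compares
 * equal to the original void pointer. */
void *roundtrip_via_uintptr(void *p) {
    uintptr_t bits = (uintptr_t)p;  /* pointer to integer */
    return (void *)bits;            /* integer back to pointer */
}

/* Hypothetical helper: read an int through a pointer that has
 * passed through an integer. Whether and why this access is
 * allowed is exactly what a provenance semantics has to answer. */
int read_via_integer(int *p) {
    int *q = (int *)roundtrip_via_uintptr((void *)p);
    return *q;
}
```

The open questions start once the integer is modified, compared, or combined with other integers before being cast back.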

Overview

Explore the possible source-language semantics for memory objects and pointers, in ISO C and in C as it is used and implemented in practice, focusing especially on pointer provenance.

Two proposals: one tracks provenance via integers (PVI), the other does not (PNVI, “provenance not via integers”).
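
As a rough illustration of what “provenance” adds to a pointer value, the sketch below models a pointer as a pair of an allocation ID and an address. It is a toy model under my own assumptions, not the formal semantics of either proposal, and every name in it (`toy_alloc`, `toy_ptr`, `toy_access_ok`, `toy_int_to_ptr`) is hypothetical. The integer-to-pointer cast follows the spirit of the “not via integers” approach: integers carry no provenance, so the cast looks up whichever allocation contains the address.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: one entry per live allocation. */
typedef struct { uintptr_t base; size_t size; } toy_alloc;

/* A pointer value is an address plus a provenance:
 * an index into the allocation table, or -1 for "empty". */
typedef struct { int prov; uintptr_t addr; } toy_ptr;

/* An access of `width` bytes is allowed only if the pointer has a
 * provenance and the bytes lie inside that allocation's footprint. */
bool toy_access_ok(const toy_alloc *allocs, size_t n, toy_ptr p, size_t width) {
    if (p.prov < 0 || (size_t)p.prov >= n) return false;
    const toy_alloc *a = &allocs[p.prov];
    return p.addr >= a->base
        && width <= a->size
        && p.addr - a->base <= a->size - width;
}

/* Integer-to-pointer cast when integers carry no provenance:
 * take the provenance of the allocation containing the address,
 * or the empty provenance if there is none. */
toy_ptr toy_int_to_ptr(const toy_alloc *allocs, size_t n, uintptr_t addr) {
    for (size_t i = 0; i < n; i++) {
        if (addr >= allocs[i].base && addr - allocs[i].base < allocs[i].size)
            return (toy_ptr){ (int)i, addr };
    }
    return (toy_ptr){ -1, addr };
}
```

With two adjacent 4-byte allocations at addresses 100 and 104, a one-past-the-end pointer `{0, 104}` and a pointer `{1, 104}` hold the same address, but only the second passes `toy_access_ok`. In a model that tracks provenance via integers, the integer itself would carry the allocation ID through the cast instead of it being looked up.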

Highlight some pros and cons and open questions, and illustrate the discussion with a library of test cases.

Integrate the provenance semantics with the Cerberus semantics for much of the rest of C.

Analyse the changes required and the resulting behaviour for a port of FreeBSD to CHERI.

Build a new instrumentation tool to detect possible provenance violations in normal C code, and apply it to some of the SPEC benchmarks.

Compare with a source-language variant of the twin-allocation LLVM semantics proposal of Lee et al.

Describe ongoing interactions with WG14, exploring how the proposals could be incorporated into the ISO standard.

Problem

Memory semantics of C pointers and objects: C fits neither the extreme concrete model nor the extreme abstract model.

  • Concrete extreme: exposes the memory semantics of the underlying hardware, with memory being simply a finite partial map from machine-word addresses to bytes.
  • Abstract extreme: the language's types enforce hard distinctions, e.g. between numeric types that support arithmetic and pointer types that support dereferencing.
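
A small sketch of why C sits between the two extremes: the pointer below is typed, yet its representation bytes can be read and copied one at a time, which a purely abstract model would not permit. The function name `copy_pointer_bytewise` is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: copy a pointer's object representation
 * byte by byte through unsigned char, then return the copy. */
int *copy_pointer_bytewise(int *p) {
    int *q;
    const unsigned char *src = (const unsigned char *)&p;
    unsigned char *dst = (unsigned char *)&q;
    for (size_t i = 0; i < sizeof p; i++) {
        dst[i] = src[i];  /* each byte is just a number here */
    }
    return q;  /* same representation as p */
}
```

The copy compares equal to the original and can be dereferenced, so any provenance attached to the pointer has to survive a trip through plain bytes. A purely concrete model explains this trivially but licenses far more than compilers actually assume.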


Created Oct 3, 2020 // Last Updated Jun 13, 2022

If you could revise
the fundamental principles of
computer system design
to improve security...

... what would you change?