Cheri Domain


References: [1], [2], [3].


Questions

  1. Why do we need domain?

  2. Globals Across Domains

    • Globals are accessible to all domains, without distinction. This means that if some data is passed from Domain A to Domain B, an unrelated domain, say Domain C, may also be able to read that data.
  3. Untagged Data Across Domains

    • Untagged data in the argument list is not checked when a CCall is performed. Could this leak information?
  4. Performance Problem

    • Memory footprint caused by the larger pointer size in pointer-intensive applications: 46% overhead on average for the Olden microbenchmark suite.
      • How to selectively use capabilities and traditional pointers?
    • CCall/CReturn overhead.
      • Could a pure-hardware solution still leave flexibility for OS settings, the way an MMU provides hardware acceleration for page tables?

Motivation

(from [2] and [1]) Compared with exploit mitigation, which targets attack-vector characteristics, compartmentalization:

  • limits the privileges and further attack surfaces available to attackers;
  • mitigates a broader class of attacks;
  • does not depend on knowledge of specific attack vectors, and is resistant to an arms race as attack and defense co-evolve.

Challenges:

  • complexity, programmability, scalability.

Overview

Security models

A variety of object-capability semantics have been proposed, and we would like to be able to explore many of them on a single platform. For example, prior work has seen disagreement on synchronicity for object-capability invocation: asynchronous primitives allow callers to avoid placing trust in callee termination, but current software designs incorporate strong assumptions of synchronicity [4].

User-level C-language TCBs

A userspace address-space executive, with code spanning libc and libcheri, is responsible for security-critical TCB functions such as memory management and class loading.

The executive configures memory protection to implement isolation, safely allocates (and reallocates; e.g. via garbage collection) memory and objects, loads class code and passes initial capabilities for both memory and communications into new objects.

Useful comparisons can be made between the address-space executive and both microkernels and language security-model runtimes (e.g. Java). Unlike a microkernel, the executive resides within a UNIX process; like Java support for native code, the executive is responsible for coordinating communication between compartments and general OS services. Unlike the Java security model, code injection attacks are part of the threat model, and containment is maintained even if unexpected instructions enter execution.

Sealing

Capabilities are sealed using the CSeal instruction, which accepts two capability-register operands:

  • the code or data memory capability to be sealed, and
  • a second capability with the Permit_Seal permission set.

The virtual address of a capability with Permit_Seal set is treated as a type. Object types prevent instance data from being used with the wrong class.
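The sealing rule above can be sketched as a small software model. This is only an illustration under stated assumptions: `struct seal_cap`, `seal_cseal()` and `seal_same_class()` are hypothetical names, the struct is not the hardware capability layout, and the failure conditions are a simplification of what CSeal checks. The unsealed otype of 2^64 - 1 follows the ISAv7 note later in this page, and the Permit_Seal bit position follows the cherireg.h excerpt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of a capability; not the hardware layout. */
#define SEAL_PERM_SEAL      (1u << 7)   /* same bit as CHERI_PERM_SEAL */
#define SEAL_OTYPE_UNSEALED UINT64_MAX  /* ISAv7: unsealed otype is 2^64 - 1 */

struct seal_cap {
    bool     tag;      /* valid-capability tag */
    uint32_t perms;    /* permission bits */
    uint64_t base;     /* lower bound */
    uint64_t length;   /* size of the addressable region */
    uint64_t address;  /* current virtual address */
    uint64_t otype;    /* object type, or SEAL_OTYPE_UNSEALED */
};

/*
 * Model of CSeal: seal `target` with the type named by `sealer`.
 * Returns false (leaving *out untouched) where hardware would raise an
 * exception.
 */
bool
seal_cseal(const struct seal_cap *target, const struct seal_cap *sealer,
    struct seal_cap *out)
{
    assert(target != NULL && sealer != NULL && out != NULL);

    if (!target->tag || !sealer->tag)
        return false;
    if (target->otype != SEAL_OTYPE_UNSEALED ||
        sealer->otype != SEAL_OTYPE_UNSEALED)
        return false;               /* both operands must be unsealed */
    if ((sealer->perms & SEAL_PERM_SEAL) == 0)
        return false;               /* sealer lacks Permit_Seal */
    if (sealer->address < sealer->base ||
        sealer->address - sealer->base >= sealer->length)
        return false;               /* type value outside sealer bounds */

    *out = *target;
    out->otype = sealer->address;   /* the sealer's address is the type */
    return true;
}

/* Code and data capabilities belong to one class if their types match. */
bool
seal_same_class(const struct seal_cap *a, const struct seal_cap *b)
{
    return a->otype != SEAL_OTYPE_UNSEALED && a->otype == b->otype;
}
```

Sealing a code and a data capability with the same sealer gives them the same otype, which is what lets an invocation reject instance data paired with the wrong class.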

Implementation

Object-capability invocation

  • CCall/CReturn
  • Both rely on software exception handlers to partially implement the instructions.
  • CCall performs:
    • Hardware checks: sealing, suitable permissions, matching types.
    • Selects the exception vector according to the results.
  • CReturn:
    • Triggers a software exception without checks.
    • May be eschewed if CCall is used as an asynchronous message-send primitive.
  • Exception handlers:
    • Flush sensitive registers before the transition.
    • CHERICCallVector -> CHERICCall || CHERICReturn

Compiler Changes

(2015 SP): the compiler generates code that captures object bounds, pointer-integrity properties, and control flow. It also adds a new domain-crossing calling convention.

Modified 8KLoC in 2MLoC of total LLVM/Clang code, including LLVM C front end, MIPS backend, and target-independent optimizers.

  • Capabilities are used wherever possible to limit accidental buffer overruns and to protect pointers (including those used in control flow).
  • A new calling convention, CHERI_CCall, for functions that can be invoked across domains.
    • Only the compiler, which knows the function type, is aware of which argument and return-value registers are used.
    • The compiler clears unused argument registers in the caller context, and unused return registers in the callee context. CCall and CReturn are responsible for clearing other registers.

Annotations & Compiler

  • cheri_ccall: the compiler replaces calls to the annotated function with specially crafted calls to cheriinvoke, with the method number (identified by a global variable that is initialised by the sandbox loader) in the correct register. This annotation provides two declarations of the function:

    • cheri_method_suffix: specifies a suffix that takes an explicit CHERI class argument.
    • cheri_method_class: specifies a global variable that automatically sets the class (code and data capability) argument.
  • cheri_ccallee: the function will use the ccall calling convention. The compiler will zero any unused return registers. This allows functions to be declared that are usable both inside and between compartments.

    • Functions with the ccall calling convention have two extra capability argument registers, $c1 and $c2, which contain the code and data capabilities, and an extra integer argument register, $v0, which contains the method number. These are not exposed to functions marked as using the ccallee calling convention, which accept the normal calling convention’s argument registers.

OS changes

Added 4KLoC in 13MLoC kernel code. More details: kernel changes

  • Per-thread trusted stack that links a chain of disjoint, per-compartment stacks used by each object executing in the thread.
  • On each invocation, CCall saves a code and data capability that CReturn will use to resume the caller.
  • The caller is responsible for setting the invoked data capability ($idc) to a memory region (typically on the caller’s stack) that contains everything needed to restore state.

(2015, programmer’s guide:) The kernel implements CCall and CReturn via a combined exception handler: CHERICCallVector -> CHERICCall or CHERICReturn. The handlers perform the checks and follow the ABI specification (from the ABI: the invoked code capability is in C1, the data capability in C2).

CCall will push IDC, PCC, and PC+4; CReturn pops them. PC will be in PCC in the future (2015, programmer’s guide).
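The push/pop behaviour described above can be sketched as a toy trusted stack. This is a simplified model, not the kernel's data structure: `struct ts_frame`, `ts_ccall_push()`, `ts_creturn_pop()` and the depth limit are hypothetical, and capabilities are reduced to plain 64-bit handles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TS_DEPTH 8  /* hypothetical limit, chosen only for the sketch */

/* One saved caller context; capabilities are modelled as opaque handles. */
struct ts_frame {
    uint64_t idc;   /* invoked data capability of the caller */
    uint64_t pcc;   /* program-counter capability of the caller */
    uint64_t pc;    /* return address: caller PC + 4 */
};

struct ts_stack {
    struct ts_frame frames[TS_DEPTH];
    size_t          depth;
};

/* CCall side: save IDC, PCC and PC+4. Fails on trusted-stack overflow. */
bool
ts_ccall_push(struct ts_stack *ts, uint64_t idc, uint64_t pcc, uint64_t pc)
{
    assert(ts != NULL);
    if (ts->depth == TS_DEPTH)
        return false;
    ts->frames[ts->depth++] = (struct ts_frame){ idc, pcc, pc + 4 };
    return true;
}

/* CReturn side: restore the most recent frame. Fails on underflow. */
bool
ts_creturn_pop(struct ts_stack *ts, struct ts_frame *out)
{
    assert(ts != NULL && out != NULL);
    if (ts->depth == 0)
        return false;
    *out = ts->frames[--ts->depth];
    return true;
}
```

A return without a matching invocation hits the underflow path, which is the condition the stack-underflow test quoted near the end of this page exercises.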

ISA CCall

  • Checks that the provided sealed code ($scc) and data ($sdc) capabilities are valid and properly sealed, and have matching types and suitable permissions;
  • Checks that argument capabilities are either untagged or have the Global permission. (Untagged data and globals are not protected across domains. How, then, can a Global be constrained so that it is accessible to only a limited number of domains? Domains on top of domains? Layers?)
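The two checks listed above might be modelled as follows. This is a sketch, not the ISA pseudocode: `struct ccall_cap` and `ccall_check()` are hypothetical, and "suitable permissions" is interpreted here as an assumption that the code capability must be executable and the data capability must not be.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CCALL_PERM_GLOBAL    (1u << 0)   /* same bit as CHERI_PERM_GLOBAL */
#define CCALL_PERM_EXECUTE   (1u << 1)   /* same bit as CHERI_PERM_EXECUTE */
#define CCALL_OTYPE_UNSEALED UINT64_MAX

/* Hypothetical reduced capability: only the fields the checks need. */
struct ccall_cap {
    bool     tag;
    uint32_t perms;
    uint64_t otype;
};

enum ccall_status {
    CCALL_OK,
    CCALL_BAD_TAG,        /* code or data capability is not valid */
    CCALL_NOT_SEALED,     /* code or data capability is unsealed */
    CCALL_TYPE_MISMATCH,  /* object types differ */
    CCALL_BAD_PERMS,      /* assumption: code executable, data not */
    CCALL_LOCAL_ARG       /* a tagged argument lacks Global */
};

enum ccall_status
ccall_check(const struct ccall_cap *scc, const struct ccall_cap *sdc,
    const struct ccall_cap *args, size_t nargs)
{
    assert(scc != NULL && sdc != NULL && (args != NULL || nargs == 0));

    if (!scc->tag || !sdc->tag)
        return CCALL_BAD_TAG;
    if (scc->otype == CCALL_OTYPE_UNSEALED ||
        sdc->otype == CCALL_OTYPE_UNSEALED)
        return CCALL_NOT_SEALED;
    if (scc->otype != sdc->otype)
        return CCALL_TYPE_MISMATCH;
    if ((scc->perms & CCALL_PERM_EXECUTE) == 0 ||
        (sdc->perms & CCALL_PERM_EXECUTE) != 0)
        return CCALL_BAD_PERMS;

    /* Arguments pass if untagged (plain data) or tagged with Global. */
    for (size_t i = 0; i < nargs; i++) {
        if (args[i].tag && (args[i].perms & CCALL_PERM_GLOBAL) == 0)
            return CCALL_LOCAL_ARG;
    }
    return CCALL_OK;
}
```

The loop makes the concern in the note concrete: an untagged argument passes unconditionally, so nothing in this check limits what plain data crosses the boundary.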

ISA CReturn

  • Validates that any returned capability is global or NULL;
  • Clears non-return registers;
  • Pops and restores $pcc and $idc.

Microarchitecture Changes

  • Minor additions to implement the compartmentalization-focused ISA extensions, the sealing mechanism, and capability flow control.
  • CClearRegs: a modification to the register forwarding logic, plus a zero mask on the register file: a single bit for each register indicates whether a read should return zero.
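The zero-mask idea can be sketched in software. This is a toy model under assumptions: the names `rf_regfile`, `rf_clear_regs()`, `rf_read()` and `rf_write()` are hypothetical, the register count of 32 is illustrative, and having a write clear the mask bit is an assumed detail not stated in the notes.

```c
#include <assert.h>
#include <stdint.h>

#define RF_NREGS 32  /* illustrative register count */

struct rf_regfile {
    uint64_t value[RF_NREGS];
    uint32_t zero_mask;  /* bit i set: reads of register i return zero */
};

/* Model of CClearRegs: mark every register named in `mask` as cleared
 * in one step, without writing each register individually. */
void
rf_clear_regs(struct rf_regfile *rf, uint32_t mask)
{
    rf->zero_mask |= mask;
}

/* A read consults the zero mask before the stored value. */
uint64_t
rf_read(const struct rf_regfile *rf, unsigned int reg)
{
    assert(reg < RF_NREGS);
    return ((rf->zero_mask >> reg) & 1u) ? 0 : rf->value[reg];
}

/* Assumption: writing a register makes its stored value visible again. */
void
rf_write(struct rf_regfile *rf, unsigned int reg, uint64_t v)
{
    assert(reg < RF_NREGS);
    rf->value[reg] = v;
    rf->zero_mask &= ~(UINT32_C(1) << reg);
}
```

Clearing many registers then costs one mask update rather than one write per register, which is consistent with the cycle savings reported in the Performance section.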

Capability Flow Control

Addresses the temporal issues that arise when memory is passed between protection domains.

CHERI has a mechanism to allow some capabilities to be shared but not others. (UCAM-CL-TR-940)

  • 2-bit capability: Global vs Local cap;

  • Global may be stored via any writable capability;

  • Local may be stored only via capabilities that themselves have the Permit_Store_Local permission set.

  • CheriBSD: heap references are global and stack references are local, so sharing of stack data between protection domains is prevented.

  • If compartments are set up with just the authority to read capabilities and to store global capabilities to a shared region, they cannot exchange their local capabilities. (UCAM-CL-TR-940)
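The store rules above can be written as a single predicate. This is a sketch with a hypothetical function name, `flow_may_store_cap()`; the bit positions are copied from the cherireg.h excerpt later in this page, and the predicate covers only the storing of a tagged capability.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions follow the cherireg.h excerpt quoted in this page. */
#define FLOW_PERM_GLOBAL          (1u << 0)
#define FLOW_PERM_STORE           (1u << 3)
#define FLOW_PERM_STORE_CAP       (1u << 5)
#define FLOW_PERM_STORE_LOCAL_CAP (1u << 6)

/*
 * May a tagged capability with permissions `value_perms` be stored
 * through a capability with permissions `via_perms`?
 */
bool
flow_may_store_cap(uint32_t via_perms, uint32_t value_perms)
{
    /* The authorising capability must allow storing capabilities at all. */
    if ((via_perms & FLOW_PERM_STORE) == 0 ||
        (via_perms & FLOW_PERM_STORE_CAP) == 0)
        return false;
    /* Global capabilities may go through any such writable capability. */
    if ((value_perms & FLOW_PERM_GLOBAL) != 0)
        return true;
    /* Local capabilities additionally need Permit_Store_Local. */
    return (via_perms & FLOW_PERM_STORE_LOCAL_CAP) != 0;
}

/* Usage sketch: a shared region that accepts only global capabilities. */
void
flow_example(void)
{
    uint32_t shared = FLOW_PERM_STORE | FLOW_PERM_STORE_CAP;
    uint32_t stack  = shared | FLOW_PERM_STORE_LOCAL_CAP;

    assert(flow_may_store_cap(shared, FLOW_PERM_GLOBAL)); /* heap reference */
    assert(!flow_may_store_cap(shared, 0));               /* stack reference */
    assert(flow_may_store_cap(stack, 0));                 /* spill to own stack */
}
```

Under this model, a compartment whose only route to a shared region lacks Permit_Store_Local cannot publish its stack (local) references there.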

Performance

Memory Footprints

  • A worst-case analysis for linked-list and tree traversal operation.
  • Olden microbenchmark suite.
  • Use Capabilities for all pointers.
  • 46% overhead on average in Olden microbenchmark suite.

Domain Crossing Overhead

Figure: Total cycle count, spanning userspace and kernel, for a zero-byte memcpy in a sandbox.

  • A best-case invoke-and-return analysis: CCalling a zero-byte memcpy.
  • With hardware argument validation, i.e., object type checking in CCall: -44 cycles (5.5%).
  • With the CClearRegs instruction: -172 cycles (21%).

zlib Compartmentalization

Figure: Compression time for gzip with library compartmentalization.

  • Comparison of CHERI, process-based compartmentalization (Capsicum), and the unmodified version.
  • CHERI: a small, near-constant overhead.

    • Switches domain once.
    • Shares memory using capabilities.
  • Capsicum: linear overhead.

    • Must transfer data using IPC.

Scalable Software Compartmentalization, Oakland, 2015.

More details

  • Cheri Compartmentalization Papers
  • References: reference
    • 2016 Thesis: Hardware support for compartmentalisation. References: Norton, Robert M. Hardware support for compartmentalisation. No. UCAM-CL-TR-887. University of Cambridge, Computer Laboratory, 2016.
    • 2017 ASPLOS: CHERI JNI: Sinking the Java security model into the C. References: Chisnall, David, Brooks Davis, Khilan Gudka, David Brazdil, Alexandre Joannou, Jonathan Woodruff, A. Theodore Markettos et al. “CHERI JNI: Sinking the Java security model into the C.

  • Cheri Object Types
  • Questions from ISAv7 ch. 3.3.3: The capability is unsealed (has otype of $2^{64} - 1$). Where does the $2^{64} - 1$ come from? Don’t we only have 23 bits for the object type? Answer 1: in version 7 of the ISA, sealed and unsealed capabilities are no longer distinguished by the flag s. Unsealed capabilities were redefined as having an otype of $2^{64} - 1$, and this bit was reclaimed as reserved. User/Kernel Split: “The CHERI object-type space is split between userspace and kernel, permitting kernel object references to be delegated to userspace (if desired).

  • Cheri Permission Constants
  • CHERI ISA defined permissions: 11 permission bits are hardware defined in the CHERI ISA.

      // file: ./sys/mips/include/cherireg.h
      /*
       * CHERI ISA-defined constants for capabilities -- suitable for inclusion from
       * assembly source code.
       *
       * XXXRW: CHERI_UNSEALED is not currently considered part of the perms word,
       * but perhaps it should be.
       */
      #define CHERI_PERM_GLOBAL           (1 << 0)   /* 0x00000001 */
      #define CHERI_PERM_EXECUTE          (1 << 1)   /* 0x00000002 */
      #define CHERI_PERM_LOAD             (1 << 2)   /* 0x00000004 */
      #define CHERI_PERM_STORE            (1 << 3)   /* 0x00000008 */
      #define CHERI_PERM_LOAD_CAP         (1 << 4)   /* 0x00000010 */
      #define CHERI_PERM_STORE_CAP        (1 << 5)   /* 0x00000020 */
      #define CHERI_PERM_STORE_LOCAL_CAP  (1 << 6)   /* 0x00000040 */
      #define CHERI_PERM_SEAL             (1 << 7)   /* 0x00000080 */
      #define CHERI_PERM_CCALL            (1 << 8)   /* 0x00000100 */
      #define CHERI_PERM_UNSEAL           (1 << 9)   /* 0x00000200 */
      #define CHERI_PERM_SYSTEM_REGS      (1 << 10)  /* 0x00000400 */
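As a small illustration of how these permission bits compose, the sketch below redefines three of the constants with the values listed above and adds two hypothetical helpers, `perm_has_all()` and `perm_restrict()`. The helpers are not part of cherireg.h; `perm_restrict()` only mimics the idea that permissions can be narrowed by masking (as the CAndPerm instruction does) and never widened.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the cherireg.h excerpt above. */
#define CHERI_PERM_EXECUTE (1 << 1)
#define CHERI_PERM_LOAD    (1 << 2)
#define CHERI_PERM_STORE   (1 << 3)

/* Hypothetical helper: true if every bit in `required` is set in `perms`. */
bool
perm_has_all(uint32_t perms, uint32_t required)
{
    return (perms & required) == required;
}

/* Hypothetical helper: narrow `perms` to the bits also present in `mask`.
 * The result can only lose permissions, never gain them. */
uint32_t
perm_restrict(uint32_t perms, uint32_t mask)
{
    return perms & mask;
}

/* Usage sketch: derive a read-only view from a read-write permission set. */
void
perm_example(void)
{
    uint32_t rw = CHERI_PERM_LOAD | CHERI_PERM_STORE;
    uint32_t ro = perm_restrict(rw, CHERI_PERM_LOAD | CHERI_PERM_EXECUTE);

    assert(perm_has_all(rw, CHERI_PERM_STORE));
    assert(ro == CHERI_PERM_LOAD);  /* EXECUTE was never held, so not gained */
    assert(!perm_has_all(ro, CHERI_PERM_STORE));
}
```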

  • Rings
  • Reference 1 (From v7 2.3.14) Use of privileged features within privileged rings, depends on the program-counter capability having a suitable hardware permission set, rather than the traditional permissions in virtual memory as the supervisor. This feature allows code within kernels, microkernels, and hypervisors to be compartmentalized, preventing bypass of the capability model within the kernel virtual address space through control of virtual memory features. The feature also allows vulnerability mitigation by allowing only explicit use of privileged features: kernel code can be compiled and linked so that most code executes with a program-counter capability that does not authorize use of privilege, and only by jumping to selected program-counter capabilities can that privilege be exercised, preventing accidental use.

  • Cheri Seal
  • Questions: How can Java/C++ benefit from CHERI’s sealed caps? Can Rust ownership benefit from CHERI’s sealed caps? Why explicit unseal/ccall (ISAv71 Ch 8.18): Unseal is an explicit operation. In CHERI, converting an undereferenceable object into a pointer requires an explicit operation: CUnseal or CCall. An alternative architecture would have been one with implicit unsealing, where a sealed capability could be dereferenced without explicitly unsealing it first, provided that the subsystem attempting the dereference had some kind of ambient authority that permitted it to dereference sealed capabilities of that type.

  • Example: User Defined Object Capabilities
  • The object type in user space is split into two ranges: non-system type numbers: [1, $2^{22} - 1$]; system type numbers: [$2^{22}$, $2^{23} - 1$]; while the object types in kernel space are [$2^{23}$, $2^{24} - 1$]. Example 1: static sandboxes. Code is derived from cheritest_ccall.c. (Note: This portion of testing code in CheriBSD is commented out in cheritest.c, which means it is not available by default.
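The three ranges quoted above can be turned into a small classifier. This is a sketch with hypothetical names (`otype_classify()` and the `OTYPE_*` enumerators); the boundaries are taken directly from the ranges in the text, and any value outside them (including 0) is reported as unlisted rather than interpreted.

```c
#include <assert.h>
#include <stdint.h>

enum otype_class {
    OTYPE_UNLISTED,        /* outside the ranges quoted in the text */
    OTYPE_USER_NONSYSTEM,  /* [1, 2^22 - 1] */
    OTYPE_USER_SYSTEM,     /* [2^22, 2^23 - 1] */
    OTYPE_KERNEL           /* [2^23, 2^24 - 1] */
};

/* Hypothetical helper: classify an object type by the quoted ranges. */
enum otype_class
otype_classify(uint64_t otype)
{
    const uint64_t user_system_min = UINT64_C(1) << 22;
    const uint64_t kernel_min      = UINT64_C(1) << 23;
    const uint64_t kernel_end      = UINT64_C(1) << 24;  /* exclusive */

    if (otype >= 1 && otype < user_system_min)
        return OTYPE_USER_NONSYSTEM;
    if (otype >= user_system_min && otype < kernel_min)
        return OTYPE_USER_SYSTEM;
    if (otype >= kernel_min && otype < kernel_end)
        return OTYPE_KERNEL;
    return OTYPE_UNLISTED;
}

/* Usage sketch: the boundary values of each range. */
void
otype_example(void)
{
    assert(otype_classify(1) == OTYPE_USER_NONSYSTEM);
    assert(otype_classify((UINT64_C(1) << 22) - 1) == OTYPE_USER_NONSYSTEM);
    assert(otype_classify(UINT64_C(1) << 22) == OTYPE_USER_SYSTEM);
    assert(otype_classify(UINT64_C(1) << 23) == OTYPE_KERNEL);
}
```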

  • Example: System class object capabilities
  • The object type in user space is split into two ranges: non-system type numbers: [1, $2^{22} - 1$]; system type numbers: [$2^{22}$, $2^{23} - 1$]; while the object types in kernel space are [$2^{23}$, $2^{24} - 1$]. The CHERI system class is the CHERI system type of the system library. Object type: libcheri_system_type. Overview: in file lib/libcheri/libcheri_system.h: the header defines the interface for the libcheri system class.

  • Libcheri CCall and Trampolines
  • libcheri_ccall.c This file provides the C implementation of the userspace CCall trampoline for libcheri. Three object types are used: CCall path into rtld initialization, invocation, and CReturn. For CCall data capabilities, we use the sandbox object pointer, where we can find any data required to perform the domain transition, including a suitable capability for use with TLS. Currently, this means deriving these sealed data capabilities from DDC. (LLM: ??? where to derive in the future)

  • libcheri Stack
  • file: libcheri_stack.c This file implements the “trusted stacks for the libcheri compartmentalization model”. “Each pthread has its own trusted stack that tracks calls between libcheri objects; frames contain the information required to recover control safely to the caller context (in another protection domain) both in the event of a CReturn from the callee, and in the event of a trusted-stack unwind due to an exception terminating execution in an object prematurely.

  • libcheri Enter
  • libcheri_enter_init() initializes the landing-pad environment for system-object invocation. It implements a stack landing pad for system classes provided by libcheri. The single stack is statically allocated, meaning no concurrent invocation from sandboxes in multiple threads (or reentrantly). Currently, that is ensured by virtue of applications not themselves invoking sandboxes concurrently.

      // file: lib/libcheri/libcheri_enter.c
      void
      libcheri_enter_init(void)
      {
          /* XXX: Should be MAP_STACK, but that is broken. */
          __libcheri_enter_stack = mmap(NULL, LIBCHERI_ENTER_STACK_SIZE,
              PROT_READ | PROT_WRITE, MAP_ANON, -1, 0);
          assert(__libcheri_enter_stack !

  • libcheri Sandbox
  • Initialization of the system-class sandbox: see libcheri system sandbox. Initialization of a user-defined sandbox: sandbox_init() in libcheri_sandbox.c. Jobs include: get the current process text file and open it; sandbox_parse_ccall_methods: parse the ELF binary file (CHERI_CALLEE and CHERI_CALLER sections) to get a list of sandbox_provided_classes (class, methods, offset) and its required methods (name, class, offset); sandbox_update_main_required_method_variables: use the required method list returned by the previous step to update the data structure in the sandbox (initialize the vtable offset); create libcheri_system_vtable, libcheri_fd_vtable

  • libcheri_invoke: CCall Implementation in CheriBSD
  • libcheri_invoke is defined in assembly code, in two versions, one for each ABI. Hybrid libcheri_invoke: the Hybrid MIPS ABI (github):

      # file: lib/libcheri/mips/libcheri_invoke_hybrid.S
      #
      # Assembly wrapper for CCall on an object-capability. Its function is to save
      # and restore any general-purpose and capability registers needed on either
      # side of CCall, but not handled by the compiler. This is done by creating an
      # on-stack frame which will be pointed to by $idc before CCall, and then
      # unwrapping it again.

  • Stores
  • Reference: CCall Examples. References: [1] Stack Underflow

      // file: ./bin/cheritest/cheritest_libcheri_trustedstack.c
      /*
       * Perform a return without a corresponding invocation, to underflow the
       * trusted stack.
       */
      void
      test_sandbox_trustedstack_underflow(const struct cheri_test *ctp __unused)
      {
          struct cheri_object returncap;
          void * __capability codecap /* currently ignored: asm ("$c1") */;
          void * __capability datacap /* currently ignored: asm ("$c2") */;

          returncap = libcheri_make_sealed_return_object();
          codecap = returncap.co_codecap;
          datacap = returncap.co_datacap;

          /*
           * TODO: the branch delay slot has been removed.


  1. Fast Protection-Domain Crossing in the CHERI Capability-System Architecture, MICRO, 2016.
  2. CHERI: A Hybrid Capability-System Architecture for Scalable Software Compartmentalization, Oakland, 2015.
  3. CHERI Programmer’s Guide, UCAM-CL-TR-877, 2015.
  4. EROS: A Fast Capability System, Shapiro, J., Smith, J., and Farber, D., SOSP, 1999.
Created Jun 27, 2019 // Last Updated Jul 13, 2021
