Trampolines

file: lib/libcheri/mips/libcheri_ccall_trampoline.S

This file contains the “userspace implementation of libcheri invocation and return semantics”. “These vectors are intended to run on the inside of sealed call and return code capabilities, perform any necessary checks, transform the capability register file, and then jump into the target domain.”

Outline of trampolines

There are three types of objects, each with a corresponding kind of CCall:

  • rtld init: libcheri_ccall_rtld_vector is used for rtld initialization and destruction.

  • invoke: libcheri_ccall_invoke_vector is used for general invocation.

  • return: libcheri_creturn_vector is used for return.

Note: “We would most ideally share an implementation, varying the target $pcc load based on sealed-capability object type – but, unfortunately, the current userspace CCall instruction doesn’t [yet] provide reliable access to the operand sealed capabilities.”

In detail

rtld init

to invoke

to return

trusted stack pointer

Macro: compute_libcheri_trusted_stack dst_cap, sbop_cap, tmp_reg1, tmp_reg2. First get offset of stack and store it in tmp_reg1, then

Macro: compute_libcheri_trusted_stack_tls_offset dst_reg, tmp_reg: Use thread-local storage to retrieve the trusted stack for the current pthread. (XXXRW: Clang appears to generate roughly this code across multiple ABIs and regardless of -fpic/-fno-pic, if -mno-abicalls is used. So, go with this in all cases, but much testing definitely required – e.g., once we have dynamically linked pure-capability binaries using.)
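The idea of a per-thread trusted stack located through TLS can be illustrated in C. This is only a minimal sketch, not the assembly in libcheri_ccall_trampoline.S: the names trusted_stack, ts_frame, ts_push, ts_pop and the limit TS_MAX_FRAMES are hypothetical, and capabilities are modeled as plain integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame: the state a return trampoline would need to restore. */
struct ts_frame {
    uintptr_t saved_pcc;  /* caller's code capability, modeled as an integer */
    uintptr_t saved_csp;  /* caller's stack capability, modeled as an integer */
};

#define TS_MAX_FRAMES 8   /* assumed depth limit, for illustration only */

struct trusted_stack {
    size_t depth;                           /* number of frames in use */
    struct ts_frame frames[TS_MAX_FRAMES];
};

/* One trusted stack per thread, found via compiler/linker-provided TLS. */
static _Thread_local struct trusted_stack ts_model;

/* Rough analogue of the "compute trusted stack" step: resolve the TLS object. */
struct trusted_stack *get_trusted_stack(void)
{
    return &ts_model;
}

/* Push a frame on invoke. Returns 0 on success, -1 if the stack is full. */
int ts_push(uintptr_t pcc, uintptr_t csp)
{
    struct trusted_stack *ts = get_trusted_stack();

    assert(ts->depth <= TS_MAX_FRAMES);
    if (ts->depth == TS_MAX_FRAMES)
        return -1;
    ts->frames[ts->depth].saved_pcc = pcc;
    ts->frames[ts->depth].saved_csp = csp;
    ts->depth++;
    return 0;
}

/* Pop a frame on return. Returns 0 on success, -1 if the stack is empty. */
int ts_pop(struct ts_frame *out)
{
    struct trusted_stack *ts = get_trusted_stack();

    if (ts->depth == 0)
        return -1;
    ts->depth--;
    *out = ts->frames[ts->depth];
    return 0;
}
```

Because the object is thread-local, each pthread sees its own stack without any explicit lookup table. The sketch says nothing about whether TLS is a trustworthy location, which is the open question raised in this section.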

??? LLM: Why is the trusted stack pointer stored via TLS? Is this a trustworthy place? The macro switch #ifdef __CHERI_CAPABILITY_TLS__ is used for ??? trusted stack in CheriBSD

Simple usage examples

libcheri ccall implementation

Miscellaneous

  • NO multi-threading: “We provide a simple lock around each object to prevent concurrent entry; an error is returned if this is attempted. In the future, we may instead want to support multiple stacks per sandbox as well as reentrance onto the stack”.

  • Exception handlers abandoned: The design is similar to our earlier prototype based on a dedicated exception handler, in that we rely on hardware acceleration of certain checks, push and pop trusted stack frames, and clear the register file as required. There are some necessary differences:

    • (1) We enter and end with a jump-like, rather than exception enter/return semantic. This means two different code capabilities pointing at the runtime, selecting call or return semantics.
    • (2) We locate a trusted stack using the ambient environment’s compiler/linker-provided thread-local storage (TLS) rather than kernel per-thread state.
    • (3) Error handling is quite different, as we can’t simply jump into the kernel’s exception handler. Instead we trigger a suitable signal, or return to the originating context. (XXXRW: More here?)
  • The implementation assumes that the architecture validates:

    • cs and ds accessibility
    • cs and ds tags
    • cs and ds seals
    • cs.otype == ds.otype
    • cs and ds permissions
    • cs.offset vs cs.length
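
The checks the architecture is assumed to perform, followed by the per-object lock, can be modeled in C. This is a sketch under stated assumptions: cap_model, object_model, PERM_EXECUTE and the function names are hypothetical, the permission policy (code executable, data not) is an assumption about what "cs and ds permissions" means, and register accessibility is not modeled because it has no software analogue here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of a capability; real ones are hardware-defined. */
struct cap_model {
    bool     tag;
    bool     sealed;
    uint32_t otype;
    uint32_t perms;
    uint64_t offset;
    uint64_t length;
};

#define PERM_EXECUTE 0x1u  /* assumed bit value, for illustration only */

enum ccall_result { CCALL_OK = 0, CCALL_FAULT, CCALL_BUSY };

/* Model of the checks the trampoline assumes the architecture has done. */
bool arch_checks_pass(const struct cap_model *cs, const struct cap_model *ds)
{
    if (!cs->tag || !ds->tag)            /* cs and ds tags */
        return false;
    if (!cs->sealed || !ds->sealed)      /* cs and ds seals */
        return false;
    if (cs->otype != ds->otype)          /* cs.otype == ds.otype */
        return false;
    /* Assumed permission policy: code is executable, data is not. */
    if (!(cs->perms & PERM_EXECUTE) || (ds->perms & PERM_EXECUTE))
        return false;
    if (cs->offset >= cs->length)        /* cs.offset vs cs.length */
        return false;
    return true;
}

/* Hypothetical per-object state: a simple lock that forbids concurrent entry. */
struct object_model {
    atomic_flag busy;
};

/* Model of entry: fault on a failed check, report an error if already entered. */
enum ccall_result model_invoke(struct object_model *obj,
                               const struct cap_model *cs,
                               const struct cap_model *ds)
{
    assert(obj && cs && ds);
    if (!arch_checks_pass(cs, ds))
        return CCALL_FAULT;
    if (atomic_flag_test_and_set(&obj->busy))
        return CCALL_BUSY;
    return CCALL_OK;
}

/* Model of return: release the object so it can be entered again. */
void model_return(struct object_model *obj)
{
    atomic_flag_clear(&obj->busy);
}
```

A second model_invoke on the same object before model_return yields CCALL_BUSY, which mirrors the "error is returned" behaviour described in the first bullet.
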
Created Aug 30, 2019 // Last Updated May 18, 2021
