CheriRangeChecker Pass

Overview

The pass exists only for the MIPS target, not for RISC-V.

It is a function pass that also acts as an instruction visitor: class CheriRangeChecker : public FunctionPass, public InstVisitor<CheriRangeChecker>;

The pass is initialized through

void initializeCheriRangeCheckerPass(PassRegistry &); declared in llvm/lib/Target/Mips/Mips.h (note: LLVM generates the implementation of this init function from the INITIALIZE_PASS_BEGIN macro),

which is called from

llvm/lib/Target/Mips/MipsTargetMachine.cpp: LLVMInitializeMipsTarget(): initializeCheriRangeCheckerPass(*PR);

The pass is added to the pipeline at

llvm/lib/Target/Mips/MipsTargetMachine.cpp: void MipsPassConfig::addIRPasses(): addPass(createCheriRangeChecker());

Functions/Steps:

  • runOnFunction():

    • visit(F): collect pairs of range info and the corresponding cast instruction in two vectors: <AllocOperands, xxxCastInst> in Casts and <AllocOperands, ConstantCast> in ConstantCasts:
    • visitAddrSpaceCast(): get the ValueSource of the cast operand: auto Src = getValueSource(ASC.getOperand(0));
      • If the ValueSource is a GlobalVariable with external linkage, ignore it.
      • If the ValueSource is neither an AllocaInst nor a call site, ignore it.
      • Otherwise, get the range of the operand with getRangeForAllocation(ValueSource Src) (see below): AllocOperands AO = getRangeForAllocation(Src);
      • Store the pair of range info and instruction in Casts.
    • visitIntToPtrInst(IntToPtrInst &I2P):
      • User *P2I = testI2P(I2P): get the operand(0) as PtrToInt instruction.
      • get ValueSource of the operands of P2I
      • get the range of the operand and store the pair in Casts
    • visitRet(ReturnInst &RI):
      • Get the return value of the instruction: Value *RV = RI.getReturnValue()
      • If the return value is a ConstantExpr:
        • Get the origin of the return value: testI2P(*cast<User>(RV)), where testI2P() returns operand 0 as the PtrToInt instruction.
        • Get the range and put it in ConstantCasts.
    • visitCall(CallInst &CI): for each operand AV of the CallInst:
      • If the operand is a ConstantExpr:
        • Get the origin of the operand: testI2P(*cast<User>(AV)), where testI2P() returns operand 0 as the PtrToInt instruction.
        • Get the range of the operand and put it in ConstantCasts.
    • Create the bounds-setting function from the intrinsic cheri_cap_bounds_set: SetLengthFn = Intrinsic::getDeclaration(M, SetLength, SizeTy);
    • For each pair stored in Casts and ConstantCasts:
      • Get the cast and store it as I2P:
        • the instruction itself for entries in Casts;
        • the operand of the recorded instruction for entries in ConstantCasts.
      • Determine the insert point:
        • the instruction after I2P for entries in Casts;
        • the instruction recorded in the ConstantCast for entries in ConstantCasts.
      • Call RangeCheckedValue() (see below) to insert the bounds-setting instructions and get New, the new value that holds the cast object pointer.
      • In all uses of I2P, replace I2P with New (all users now refer to New).
  • User *testI2P(User &I2P): return operand 0 as P2I if it is an Instruction::PtrToInt and its source is a non-external global, an AllocaInst, or a CallSite.

    • If I2P is pointer type (dyn_cast<PointerType>(I2P.getType())), and
    • Pointer is CheriPointer (isCheriPointer(DestTy, TD.get())), and
    • Operand 0 has the opcode of Instruction::PtrToInt
    • do the following:
    • Cast the operand 0 to User and store as P2I;
    • if Operand 0 is pointer type and address space of the type is 0:
      • strip off the pointer casts of the Operand 0: Value *Src = P2I->getOperand(0)->stripPointerCasts()
      • If Src is Global Variable and not external linkage, return P2I;
      • If Src is AllocaInst or CallSite, return P2I;
      • Otherwise, return 0.
  • RangeCheckedValue(): insert the cheri_cap_bounds_set call and return the resulting value; the new instructions are built with IRBuilder.

    • Compute the bounds size of the object under the cast I2P.
    • Insert a call to the cheri_cap_bounds_set intrinsic and set it as the result.
    • If the offset is non-zero, insert a GEP instruction and set it as the result.
    • Insert a BitCast instruction that casts the result to the type of I2P, and return it.
  • getRangeForAllocation():

    • Recognize whether the object is a heap or a stack object:
      • a call to malloc/valloc/realloc/aligned_alloc/reallocf/calloc for a heap object;
      • an AllocaInst for a stack object.
    • Get the size argument of the object and save it in an AllocOperands instance.
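
The source filter that testI2P() applies can be sketched as a standalone model. This is only an illustration under stated assumptions: SourceKind, Source and isTrackedSource are hypothetical names standing in for the GlobalVariable / AllocaInst / CallSite checks, and none of this is the pass's actual code.

```cpp
#include <cassert>

// Hypothetical stand-ins for the IR values a cast operand can trace back to.
enum class SourceKind { Global, Alloca, Call, Other };

struct Source {
  SourceKind Kind;
  bool HasExternalLinkage; // only meaningful when Kind == SourceKind::Global
};

// Models the filter described in the notes: keep non-external globals,
// allocas and call sites; skip everything else.
bool isTrackedSource(const Source &Src) {
  // In this model, linkage is a property of globals only.
  assert(Src.Kind == SourceKind::Global || !Src.HasExternalLinkage);
  switch (Src.Kind) {
  case SourceKind::Global:
    return !Src.HasExternalLinkage;
  case SourceKind::Alloca:
  case SourceKind::Call:
    return true;
  case SourceKind::Other:
    return false;
  }
  return false;
}
```

In this model a stack slot (Source{SourceKind::Alloca, false}) is tracked, while a global with external linkage (Source{SourceKind::Global, true}) is skipped.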

Main funcs

  • getRangeForAllocation(ValueSource Src)

Get the range of a malloc’d or alloca’d object.

CallSite Malloc = CallSite(Src.Base); // wrap the source in a CallSite to query the call instruction

Function *Fn = Malloc.getCalledFunction(); // the called function

Fn->getName(); // the function name as a string; one of malloc, valloc, realloc, aligned_alloc, reallocf, calloc

Malloc.getArgument(0); // the first argument

// StringSwitch (llvm/ADT/StringSwitch.h) maps the callee name to an integer, which an ordinary switch then dispatches on

  switch (StringSwitch<int>(Fn->getName())
              .Case("malloc", 1)
              .Case("valloc", 1)
              .Case("realloc", 2)
              .Case("aligned_alloc", 2)
              .Case("reallocf", 2)
              .Case("calloc", 3)
              .Default(-1)) {
  default: // not a recognized allocator: return empty operands
    return AllocOperands();
  case 1: // the size is argument 0
    return AllocOperands{Malloc.getArgument(0), nullptr, Src,
                         cheri::SetBoundsPointerSource::Heap};
  case 2: // the size is argument 1
    return AllocOperands{Malloc.getArgument(1), nullptr, Src,
                         cheri::SetBoundsPointerSource::Heap};
  case 3: // calloc: both arguments are recorded
    return AllocOperands{Malloc.getArgument(0), Malloc.getArgument(1),
                         Src, cheri::SetBoundsPointerSource::Heap};
  }
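
The same dispatch can be tried outside LLVM with a small standalone sketch. It uses plain string comparison instead of StringSwitch; classifyAllocator, AllocSize and allocationSize are hypothetical names, and multiplying the two calloc arguments is an assumption about how the two recorded operands are later combined into a size.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Same grouping as the StringSwitch in the notes:
// 1 = size is argument 0, 2 = size is argument 1, 3 = calloc, -1 = unknown.
int classifyAllocator(const std::string &Name) {
  if (Name == "malloc" || Name == "valloc")
    return 1;
  if (Name == "realloc" || Name == "aligned_alloc" || Name == "reallocf")
    return 2;
  if (Name == "calloc")
    return 3;
  return -1;
}

// Hypothetical result type: Known is false for unrecognized callees.
struct AllocSize {
  bool Known;
  std::uint64_t Bytes;
};

// Byte size requested by a call with constant arguments.
AllocSize allocationSize(const std::string &Name,
                         const std::vector<std::uint64_t> &Args) {
  switch (classifyAllocator(Name)) {
  case 1:
    assert(Args.size() >= 1);
    return AllocSize{true, Args[0]};
  case 2:
    assert(Args.size() >= 2);
    return AllocSize{true, Args[1]};
  case 3:
    assert(Args.size() >= 2);
    // Assumption: count * element size gives the object size.
    return AllocSize{true, Args[0] * Args[1]};
  default:
    return AllocSize{false, 0};
  }
}
```

For example, allocationSize("realloc", {0, 128}) picks the second argument, and allocationSize("free", {0}) reports an unknown callee, which corresponds to the empty AllocOperands() returned by the default case.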


Summary

Get and set bounds for heap and stack objects.

First, track the operands of the following instructions and collect the object size and the source instruction:

  • AddrSpaceCast: track the object at operand 0.
  • IntToPtrInst: track the object at operand 0.
  • Ret: track the object at the return value.
  • Call: track each operand i that is a ConstantExpr whose operand 0 is a PtrToInt.

Second, insert instructions at each collected cast that set the bounds on the pointer.
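
The effect of the second step can be illustrated with a toy model. This is a sketch, not the CHERI capability encoding or the pass's code: Capability, setBounds, addOffset, canAccess and rangeChecked are hypothetical names, and the order (set bounds first, then re-apply a non-zero offset) follows the RangeCheckedValue() steps listed earlier.

```cpp
#include <cassert>
#include <cstdint>

// Toy capability: base address, length in bytes, and a cursor offset from Base.
struct Capability {
  std::uint64_t Base;
  std::uint64_t Length;
  std::uint64_t Offset;
};

// Narrow Cap to Size bytes starting at its cursor; the new bounds must fit
// inside the old ones. Models what a bounds-setting intrinsic does.
Capability setBounds(const Capability &Cap, std::uint64_t Size) {
  assert(Cap.Offset <= Cap.Length && Size <= Cap.Length - Cap.Offset);
  return Capability{Cap.Base + Cap.Offset, Size, 0};
}

// Move the cursor without touching the bounds; models the GEP step.
Capability addOffset(const Capability &Cap, std::uint64_t Delta) {
  return Capability{Cap.Base, Cap.Length, Cap.Offset + Delta};
}

// True if Bytes bytes starting at the cursor lie inside the bounds.
bool canAccess(const Capability &Cap, std::uint64_t Bytes) {
  return Cap.Offset <= Cap.Length && Bytes <= Cap.Length - Cap.Offset;
}

// Set bounds to the object size, then re-apply a non-zero offset.
Capability rangeChecked(const Capability &Wide, std::uint64_t ObjectSize,
                        std::uint64_t Offset) {
  Capability Result = setBounds(Wide, ObjectSize);
  if (Offset != 0)
    Result = addOffset(Result, Offset);
  return Result;
}
```

With a 16-byte object and an offset of 4, the narrowed capability allows a 12-byte access from the cursor but not a 13-byte one, whereas the original wide capability would have allowed both.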

Created Nov 1, 2019 // Last Updated Oct 2, 2020
