uXOM

$\mu$XOM1: Efficient eXecute-Only Memory on ARM Cortex-M. USENIX Security, 2019.

Background

Code injection attacks: W$\oplus$X. Virtually all processors today are equipped with at least five basic memory permissions: read-write-execute (RWX), read-write (RW), read-execute (RX), read-only (RO), and no-access (NA). W$\oplus$X can therefore be enforced efficiently in hardware for a memory region simply by disabling RWX.
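As a small illustration of why disabling RWX is enough, the sketch below models the five permissions as flag sets and filters out the ones that violate W$\oplus$X. The names `Perm`, `PERMISSIONS`, and `satisfies_wxorx` are made up for this note and do not come from the paper.

```python
from enum import Flag

class Perm(Flag):
    NONE = 0
    R = 1
    W = 2
    X = 4

# The five basic permissions listed above (labels are illustrative).
PERMISSIONS = {
    "RWX": Perm.R | Perm.W | Perm.X,
    "RW": Perm.R | Perm.W,
    "RX": Perm.R | Perm.X,
    "RO": Perm.R,
    "NA": Perm.NONE,
}

def satisfies_wxorx(perm):
    """A region satisfies W xor X if it is never writable and executable at once."""
    return not (Perm.W in perm and Perm.X in perm)

def allowed_under_wxorx():
    """Names of the permissions that remain usable once W xor X is enforced."""
    return [name for name, perm in PERMISSIONS.items() if satisfies_wxorx(perm)]
```

Calling `allowed_under_wxorx()` leaves every permission except RWX, which matches the statement that W$\oplus$X only requires disabling RWX.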

Disclosure attacks: attempts to read part or all of the code. Code often contains intellectual property (IP), including core algorithms, and sensitive data such as cryptographic keys. Disclosed code can also be abused to launch code reuse attacks (CRAs).

XOM: eXecute-Only-Memory.

  • By main memory encryption 2.
  • By hardware permission bits (execute-only) in Exoshim 3, Norax 4, KHide 5, and Readactor 6.
  • XOM permission bits are generally available on servers, desktops, smartphones, but not available on embedded devices.
  • SFI-based XOM 7, kR^X 8: performs suboptimally; can be circumvented (by a privileged app).

Embedded devices:

  • Applications and kernels operate at the same privilege level.
  • Real-time constraints: mode switching is expensive 9.

Cortex-M:

Interesting exception feature:

  • An exception return occurs when a unique EXC_RETURN value (e.g. 0xFFFFFFF1) is loaded into the pc via memory load instructions, such as POP, LDM, or LDR, or via indirect branch instructions, such as BX.
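A minimal sketch of this behaviour, assuming only the three basic (non-floating-point) ARMv7-M EXC_RETURN encodings. The function name `decode_pc_load` and the tuple results are illustrative, and treating such a value as a plain branch outside handler mode is a simplification of what the hardware does.

```python
# Basic (non-floating-point) EXC_RETURN encodings on ARMv7-M:
# value -> (mode returned to, stack pointer used after the return).
EXC_RETURN_TARGETS = {
    0xFFFFFFF1: ("Handler", "MSP"),
    0xFFFFFFF9: ("Thread", "MSP"),
    0xFFFFFFFD: ("Thread", "PSP"),
}

def decode_pc_load(value, in_handler_mode):
    """Classify a value written to pc by POP/LDM/LDR/BX.

    Returns ('exception_return', mode, stack) or ('branch', target).
    """
    if in_handler_mode and value in EXC_RETURN_TARGETS:
        mode, stack = EXC_RETURN_TARGETS[value]
        return ("exception_return", mode, stack)
    # Otherwise it is an ordinary branch; bit 0 is the Thumb bit, not part of the address.
    return ("branch", value & ~1)
```

The point for the rest of these notes is that the same instructions that perform ordinary returns can also trigger an exception return, purely depending on the value loaded.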

Key Idea

$\mu$XOM converts all memory instructions into unprivileged ones and sets the code region as privileged. As a result, converted instructions cannot access code regions, thereby effectively enforcing the XO permission onto code regions.
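The key idea can be modelled in a few lines. This is a toy model, not the paper's implementation: the address ranges, `MemFault`, and `check_data_access` are assumptions for illustration, and instruction fetch (which stays allowed because the CPU runs privileged) is not modelled.

```python
CODE_REGION = range(0x08000000, 0x08010000)   # hypothetical flash range
DATA_REGION = range(0x20000000, 0x20005000)   # hypothetical SRAM range

class MemFault(Exception):
    """Raised when the modelled MPU rejects a data access."""

def check_data_access(addr, privileged_access):
    """Model an MPU where code is privileged-only and data is open to all."""
    if addr in CODE_REGION:
        if not privileged_access:
            raise MemFault(f"unprivileged access to code at {addr:#x}")
        return "code"
    if addr in DATA_REGION:
        return "data"
    raise MemFault(f"unmapped address {addr:#x}")

def ldr(addr):
    """Ordinary load: checked at the CPU's current (privileged) level."""
    return check_data_access(addr, privileged_access=True)

def ldrt(addr):
    """Unprivileged load: always checked as an unprivileged access."""
    return check_data_access(addr, privileged_access=False)
```

In this model `ldrt` still reads data normally but faults on the code region, while an unconverted `ldr` can read code. That remaining `ldr` is what the challenges below are about.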

Threat model

“The bare-metal software installed in the device is considered benign but internally holds software vulnerabilities, so that the attackers may exploit the vulnerabilities and ultimately have arbitrary memory read and write capability.”

“We do not trust any software components, including the exception handlers.”

“Event-driven nature of tiny embedded systems signifies that exception handlers can take a large portion of embedded software components10, so we cannot just assume the security of these handlers. Thus, we assume that attackers can trigger a vulnerability inside the exception handler and manipulate any data including the CPU context saved on exception entry.”

==> untrusted exception handler: LLM: Then what to protect??? How to protect??? exception handler not the protected code contents???

Challenges/Solutions

C1. Unconvertible memory instructions: some memory instructions cannot be changed into unprivileged memory instructions.

  • those access critical system resources, interrupt controller, timer, MPU, etc.
  • load/store exclusive instructions do not have unprivileged counterparts.

  • $\mu$XOM: analyze the code to exclude them from instrumentation. This results in C2, C3, C4.
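The conversion and its exclusions can be sketched as a simple pass over memory instructions. This is a hedged illustration only: the `(mnemonic, target)` representation, the `instrument` function, and the assumption that target addresses are known statically are all mine. The mnemonic pairs are the Thumb-2 unprivileged load/store forms, and the PPB range is the Cortex-M private peripheral bus.

```python
# Thumb-2 loads/stores and their unprivileged counterparts.
UNPRIVILEGED_FORM = {
    "LDR": "LDRT", "LDRB": "LDRBT", "LDRH": "LDRHT",
    "LDRSB": "LDRSBT", "LDRSH": "LDRSHT",
    "STR": "STRT", "STRB": "STRBT", "STRH": "STRHT",
}
PPB = range(0xE0000000, 0xE0100000)  # private peripheral bus (MPU, NVIC, SysTick, ...)

def instrument(memory_instructions):
    """Split memory instructions into converted and unconverted ones.

    memory_instructions: list of (mnemonic, target_address_or_None).
    """
    converted, unconverted = [], []
    for mnemonic, target in memory_instructions:
        touches_ppb = target is not None and target in PPB
        if mnemonic in UNPRIVILEGED_FORM and not touches_ppb:
            converted.append((UNPRIVILEGED_FORM[mnemonic], target))
        else:
            # No unprivileged form (e.g. LDREX/STREX) or a privileged-only target.
            unconverted.append((mnemonic, target))
    return converted, unconverted
```

For example, a store to MPU_CTRL at 0xE000ED94 and an `LDREX` both end up in the unconverted list, and these leftovers are what an attacker would try to reach.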

C2. Malicious indirect branches: attackers can abuse unconverted instructions by jumping to them, e.g. to 1) turn off the MPU or 2) read code directly.

C3. Malicious exception returns: Cortex-M saves and restores context in hardware for fast exception entry and return. Attackers can exploit a vulnerability while in exception handling mode to corrupt the stacked context, such as the return address.

  • If attackers corrupt the return address and then trigger an exception return by assigning EXC_RETURN value to pc, they will be able to execute any instruction in the program including the unconverted load/stores.

C4. Malicious data manipulation: attackers can control all data, e.g. call an MPU configuration function with crafted arguments.

C5. Unintended instructions: attackers alter control flow to execute unintended/unaligned/data-derived instructions in the code region[^usfi]. Sources include immediate values embedded in code and the middle of 32-bit instructions.
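To make the "middle of 32-bit instructions" case concrete, the sketch below scans a stream of Thumb halfwords and flags second halfwords of 32-bit instructions that would decode as a 16-bit `LDR Rt, [Rn, #imm5]` if control jumped there. It is a simplification under stated assumptions: the function names are hypothetical, it only looks for one instruction class, and it assumes the stream starts at an instruction boundary and contains no literal pools.

```python
def is_32bit_first_halfword(hw):
    """Thumb-2: a halfword whose top five bits are 0b11101, 0b11110 or 0b11111
    is the first half of a 32-bit instruction."""
    return (hw >> 11) in (0b11101, 0b11110, 0b11111)

def is_16bit_ldr_imm(hw):
    """16-bit LDR Rt, [Rn, #imm5] (encoding T1) has top five bits 0b01101."""
    return (hw >> 11) == 0b01101

def find_hidden_loads(halfwords):
    """Return indices of second halfwords that would decode as a 16-bit LDR."""
    hits, i = [], 0
    while i < len(halfwords):
        if is_32bit_first_halfword(halfwords[i]) and i + 1 < len(halfwords):
            if is_16bit_ldr_imm(halfwords[i + 1]):
                hits.append(i + 1)
            i += 2  # skip both halves of the 32-bit instruction
        else:
            i += 1
    return hits

# 0xF8D0 0x6800 should be a 32-bit LDR.W whose second half reads as `ldr r0, [r0]`
# (my reading of the encoding, not taken from the paper); 0x4770 is `bx lr`.
EXAMPLE = [0xF8D0, 0x6800, 0x4770]
```

Running `find_hidden_loads(EXAMPLE)` flags index 1: an ordinary (unconverted-looking) load hidden inside an instruction the compiler did emit.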

Address C2, C3, C4: unconverted instructions in $\mu$XOM are instrumented with verification routines:

  • atomic verification: virtually enables memory instructions to be executed atomically with verification routine.

Address C5: replace dangerous sequences with secure equivalents. - e.g. LLM: ???

Key Takeaways

Atomic Verification Technique

Inspired by the reference monitor concept: verify every memory access made by the unconverted loads/stores.

To be atomic:

  • A dedicated base register is used for every unconverted load/store;
  • The dedicated register must be set to a target address of each unconverted load/store immediately before the associated verification routine;
  • The dedicated register must hold a non-harmful address (i.e., not a code or the PPB address) when the atomic instruction sequence is not executed.
  • Use sp as an alternative to “dedicated” register;
    • Disable interrupts before calling verification routine.
    • Set sp as the target of unconverted load/store (such as PPB address), then call verification routine.
    • When sp changes by a non-constant: insert sp check;
    • When sp changes by a constant: (e.g. prolog/epilog), redzones: non-accessible regions around valid stack region (already on board);
    • When sp changes in an exception handler: check sp. Otherwise, attackers could avoid all the sp checks by triggering an interrupt right after they corrupt sp.
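The rules above can be summarised as a toy model of one atomic sequence. This is a sketch under my own assumptions rather than the paper's actual instruction sequence: `Cpu`, `Violation`, `verify_target`, the address ranges, and the `allowed_targets` whitelist are all hypothetical, and the real check happens in inlined machine code, not a Python function.

```python
CODE_REGION = range(0x08000000, 0x08010000)   # hypothetical flash range
STACK_REGION = range(0x20004000, 0x20005000)  # hypothetical valid stack range

class Violation(Exception):
    """Raised when a verification or sp check fails."""

class Cpu:
    def __init__(self):
        self.sp = 0x20004F00
        self.interrupts_enabled = True

def verify_target(cpu, allowed_targets):
    """Verification routine: the base register (sp) must hold a permitted target."""
    if cpu.sp in CODE_REGION or cpu.sp not in allowed_targets:
        raise Violation(f"illegal target {cpu.sp:#x}")

def atomic_unconverted_access(cpu, target, allowed_targets, access):
    """Run one unconverted load/store with sp as its dedicated base register."""
    saved_sp, saved_irq = cpu.sp, cpu.interrupts_enabled
    cpu.interrupts_enabled = False       # no handler may run while sp is repurposed
    try:
        cpu.sp = target                  # set immediately before verification
        verify_target(cpu, allowed_targets)
        return access(cpu.sp)            # the unconverted load/store itself
    finally:
        cpu.sp = saved_sp                # sp holds a harmless address again
        cpu.interrupts_enabled = saved_irq

def check_sp(cpu):
    """Check inserted where sp changes by a non-constant amount or inside handlers."""
    if cpu.sp not in STACK_REGION:
        raise Violation(f"sp {cpu.sp:#x} is outside the valid stack")
```

Because every other use of sp is forced to stay inside the stack region, jumping straight to the unconverted access (skipping the verification) would only ever operate on a stack address in this model.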

use sp as dedicated register to store atomic verification routine address

Another way to be atomic (in Silhouette): CFI, so that control cannot jump into the middle of a function. However, interrupt returns must still be able to jump there, and this can be abused.

CRA defence as a use case

Code disclosure in CRA:

  • directly reading code;
  • indirectly infer code layout by the value of code pointers.

Readactor 11: a code-diversification-based CRA defence with resistance to code disclosure attacks. What Readactor does:

  • places all code in XOM to prevent the direct disclosure
  • replaces all code pointers with pointers to trampolines. Then code pointers containing the original code location are never stored in a register or memory ??? Then how to call it?

In uXOM:

  • Every function call is replaced with a direct branch to the trampoline then a call to the original function in the trampoline.
  • When the original function returns, another direct branch takes the control flow back to the original callsite.
  • Every function pointer is replaced with a pointer to the corresponding function trampoline. ==> ??? how about the function pointer in the trampoline then?
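A toy model of the trampoline indirection may help with the question above. It is my own illustration, not code from the paper or from Readactor: the `Firmware` class, the addresses, and `XomFault` are hypothetical. The assumption being modelled is that the trampoline itself lives in execute-only memory, so the branch target encoded inside it cannot be read as data.

```python
class XomFault(Exception):
    """Raised when something tries to read execute-only memory as data."""

class Firmware:
    def __init__(self):
        # Execute-only code: function bodies and trampolines (addresses hypothetical).
        self.functions = {0x08004200: lambda x: x + 1}   # original function body
        self.trampolines = {0x08000100: 0x08004200}      # trampoline -> direct branch target
        # Readable data: function pointers hold trampoline addresses only.
        self.data = {"callback": 0x08000100}

    def read(self, key):
        """Attacker-visible read: data is readable, code is execute-only."""
        if key in self.data:
            return self.data[key]
        raise XomFault("code region is execute-only")

    def call_pointer(self, key, arg):
        trampoline = self.data[key]            # indirect branch to the trampoline
        target = self.trampolines[trampoline]  # direct branch encoded inside XOM
        return self.functions[target](arg)
```

In this model a leaked function pointer reveals only the trampoline address, while the real function address appears solely as a branch target inside memory that faults on reads.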

Evaluation

LLVM compiler; Radare2 binary analysis framework12

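| Solution | Code size overhead | Execution time overhead | Energy overhead |
|----------|--------------------|-------------------------|-----------------|
| uXOM     | 15.7%              | 7.3%                    | 7.5%            |
| SFI-XOM  | 50.8%              | 22.7%                   | 22.3%           |
| uXOM-CRA | 19.3%              | 8.6%                    | 9.7%            |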

  1. $\mu$XOM: Efficient eXecute-Only Memory on ARM Cortex-M. USENIX Security, 2019. ↩
  2. David Lie, Chandramohan Thekkath, Mark Mitchell, Patrick Lincoln, Dan Boneh, John Mitchell, and Mark Horowitz. Architectural support for copy and tamper resistant software. ACM SIGPLAN Notices, 2000. ↩
  3. Exoshim: Preventing memory disclosure using execute-only kernel code. Cyber Warfare and Security. 2016. ↩
  4. Norax: Enabling execute-only memory for cots binaries on aarch64. SP, 2017. ↩
  5. Preventing kernel code-reuse attacks through disclosure resistant code diversification. CNS, 2016. ↩
  6. Readactor: Practical code randomization resilient to memory disclosure. SP, 2015. ↩
  7. Leakage-resilient layout randomization for mobile devices. In NDSS, 2016. ↩
  8. kr^x: Comprehensive kernel protection against just-in-time code reuse. CCS, 2017. ↩
  9. Securing real-time microcontroller systems through customized memory view switching. In NDSS, 2018. ↩
  10. Fie on firmware: Finding vulnerabilities in embedded systems using symbolic execution. USENIX Security. 2013. ↩
  11. Readactor: Practical code randomization resilient to memory disclosure. SP, 2015. ↩
  12. unix-like reverse engineering framework and commandline tools. ↩
Created Aug 12, 2019 // Last Updated Oct 12, 2019
