Relocation Section

Relocation is the process of connecting symbolic references with symbolic definitions. For example, when a program calls a function, the associated call instruction must transfer control to the proper destination address at execution. In other words, relocatable files must have information that describes how to modify their section contents, thus allowing executable and shared object files to hold the right information for a process’s program image. Relocation entries are the data designed for this task.

Relocation Entries

typedef struct {
    Elf32_Addr  r_offset;
    Elf32_Word  r_info;
} Elf32_Rel; 

typedef struct {
    Elf32_Addr  r_offset;
    Elf32_Word  r_info;
    Elf32_Sword r_addend;
} Elf32_Rela;
  • r_offset: gives the location at which to apply the relocation action. For a relocatable file, the value is the byte offset from the beginning of the section to the storage unit affected by the relocation. For an executable file or a shared object, the value is the virtual address of the storage unit affected by the relocation. Here the storage unit is the field being patched, such as an instruction operand or a data word. Relocation entries for different object files have slightly different interpretations of r_offset.

    • In relocatable files, r_offset holds a section offset. That is, the relocation section itself describes how to modify another section in the same file; relocation offsets designate a storage unit within the second section.
    • In executable and shared object files, r_offset holds a virtual address. To make these files' relocation entries more useful for the dynamic linker, the section offset (file interpretation) gives way to a virtual address (memory interpretation).
    • The file and memory interpretations of r_offset differ only to make relocation access more efficient for the program that consumes the entries; the underlying meaning of r_offset and of the relocation types stays the same.
  • r_info: gives both the symbol table index with respect to which the relocation must be made, and the type of the relocation to apply.

    • For example, a call instruction’s relocation entry would hold the symbol table index of the function being called. If the index is STN_UNDEF, the undefined symbol index, the relocation uses 0 as the symbol value.
    • Relocation types are processor-specific; descriptions of their behavior appear in the processor supplement. When the text in the processor supplement refers to a relocation entry’s relocation type or symbol table index, it means the result of applying ELF32_R_TYPE or ELF32_R_SYM, respectively, to the entry’s r_info member.
  • r_addend: a constant addend used to compute the value to be stored into the relocatable field.
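
The split of r_info into a symbol index and a relocation type can be sketched in C. The helper names below (rel32_sym, rel32_type, rel32_info, rel64_sym, rel64_type) are hypothetical; they mirror what the standard ELF32_R_SYM, ELF32_R_TYPE, ELF32_R_INFO, ELF64_R_SYM and ELF64_R_TYPE macros compute. The 64-bit helpers assume the generic ELF64 layout used by targets such as x86-64; MIPS64 packs several type fields into r_info and would need a different decoder.

```c
#include <stdint.h>

/* Hypothetical helpers mirroring ELF32_R_SYM / ELF32_R_TYPE / ELF32_R_INFO:
 * in ELF32 the low 8 bits of r_info hold the relocation type and the
 * remaining upper bits hold the symbol table index. */
static inline uint32_t rel32_sym(uint32_t r_info)  { return r_info >> 8; }
static inline uint32_t rel32_type(uint32_t r_info) { return r_info & 0xffu; }
static inline uint32_t rel32_info(uint32_t sym, uint32_t type)
{
    return (sym << 8) | (type & 0xffu);
}

/* Hypothetical helpers mirroring ELF64_R_SYM / ELF64_R_TYPE (generic
 * ELF64 layout): the upper 32 bits are the symbol index, the lower 32
 * bits are the relocation type. */
static inline uint32_t rel64_sym(uint64_t r_info)  { return (uint32_t)(r_info >> 32); }
static inline uint32_t rel64_type(uint64_t r_info) { return (uint32_t)(r_info & 0xffffffffu); }
```

For instance, an x86-64 Info value of 0x000900000002 decodes to symbol index 9 and type 2, which is R_X86_64_PC32.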

A relocation section references two other sections: a symbol table and a section to modify. The section header’s sh_link and sh_info members, respectively, specify these relationships.

$ readelf -r global_array.exe

There are no relocations in this file.

$ readelf -r global_array.o

Relocation section '.rela.text' at offset 0xcd8 contains 28 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000010  0000000e00051807 R_MIPS_GPREL16/R_MIPS_SUB/R_MIPS_HI16 0000000000000000 set_global_int + 0
0000000000000018  0000000e00061807 R_MIPS_GPREL16/R_MIPS_SUB/R_MIPS_LO16 0000000000000000 set_global_int + 0
0000000000000058  0000000a00000013 R_MIPS_GOT_DISP/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 global_int + 0
00000000000000b4  0000000f00051807 R_MIPS_GPREL16/R_MIPS_SUB/R_MIPS_HI16 00000000000000a0 use_global_int + 0
00000000000000bc  0000000f00061807 R_MIPS_GPREL16/R_MIPS_SUB/R_MIPS_LO16 00000000000000a0 use_global_int + 0
00000000000000c4  0000000600000014 R_MIPS_GOT_PAGE/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .rodata.str1.1 + 0
00000000000000c8  0000000600000015 R_MIPS_GOT_OFST/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .rodata.str1.1 + 0
00000000000000cc  0000000d0000000b R_MIPS_CALL16/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 printf + 0
0000000000000110  0000000a00000013 R_MIPS_GOT_DISP/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 global_int + 0
0000000000000144  0000000600000014 R_MIPS_GOT_PAGE/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .rodata.str1.1 + 2
0000000000000148  0000000600000015 R_MIPS_GOT_OFST/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .rodata.str1.1 + 2
000000000000014c  0000000d0000000b R_MIPS_CALL16/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 printf + 0
00000000000000d8  0000000d00000025 R_MIPS_JALR/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 printf + 0
000000000000016c  0000000600000014 R_MIPS_GOT_PAGE/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .rodata.str1.1 + 7
0000000000000170  0000000600000015 R_MIPS_GOT_OFST/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .rodata.str1.1 + 7
0000000000000174  0000000d0000000b R_MIPS_CALL16/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 printf + 0
0000000000000154  0000000d00000025 R_MIPS_JALR/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 printf + 0
00000000000001ac  0000000600000014 R_MIPS_GOT_PAGE/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .rodata.str1.1 + e
00000000000001b0  0000000600000015 R_MIPS_GOT_OFST/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .rodata.str1.1 + e
00000000000001b4  0000000d0000000b R_MIPS_CALL16/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 printf + 0
000000000000017c  0000000d00000025 R_MIPS_JALR/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 printf + 0
00000000000001bc  0000000d00000025 R_MIPS_JALR/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 printf + 0
00000000000001f4  0000000c00051807 R_MIPS_GPREL16/R_MIPS_SUB/R_MIPS_HI16 00000000000001e0 main + 0
00000000000001fc  0000000c00061807 R_MIPS_GPREL16/R_MIPS_SUB/R_MIPS_LO16 00000000000001e0 main + 0
0000000000000200  0000000f0000000b R_MIPS_CALL16/R_MIPS_NONE/R_MIPS_NONE 00000000000000a0 use_global_int + 0
0000000000000224  0000000f0000000b R_MIPS_CALL16/R_MIPS_NONE/R_MIPS_NONE 00000000000000a0 use_global_int + 0
0000000000000218  0000000f00000025 R_MIPS_JALR/R_MIPS_NONE/R_MIPS_NONE 00000000000000a0 use_global_int + 0
0000000000000230  0000000f00000025 R_MIPS_JALR/R_MIPS_NONE/R_MIPS_NONE 00000000000000a0 use_global_int + 0

...

Relocation section '.rela.stack_sizes' at offset 0xfc0 contains 3 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000500000012 R_MIPS_64/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .text + 0
0000000000000009  0000000500000012 R_MIPS_64/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .text + a0
0000000000000012  0000000500000012 R_MIPS_64/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .text + 1e0

Relocation section '.rela.debug_info' at offset 0x1008 contains 29 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000006  0000000700000002 R_MIPS_32/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .debug_abbrev + 0
000000000000000c  0000000800000002 R_MIPS_32/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .debug_str + 0
0000000000000012  0000000800000002 R_MIPS_32/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .debug_str + 67
0000000000000016  0000000900000002 R_MIPS_32/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .debug_line + 0
000000000000001a  0000000800000002 R_MIPS_32/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .debug_str + 76
000000000000001e  0000000500000012 R_MIPS_64/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .text + 0

PLT and GOT

PLT (.plt) and GOT (.got) are for dynamic linking.

Relocation entry: A relocation entry in a binary is a descriptor which essentially says “determine the value of X, and put that value into the binary at offset Y” – each relocation has a specific type, defined in the ABI documentation, which describes exactly how to “determine the value of X”.

These relocation entries in binaries are left to be filled in later – at link time by the toolchain linker or at runtime by the dynamic linker.

$ cat test.c

extern int foo;
int function(void){
	return foo;
}

$ gcc -c test.c -o test.o
$ readelf --relocs ./test.o

Relocation section '.rela.text' at offset 0x1b8 contains 1 entry:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000006  000900000002 R_X86_64_PC32     0000000000000000 foo - 4

The address of foo is not known at compile time, so the compiler leaves behind a relocation entry (of type R_X86_64_PC32) which says “in the final binary, patch the 4 bytes at offset 0x6 of .text with the PC-relative offset to symbol foo, adjusted by the addend -4”. If you disassemble the object file, you can see 4 bytes of zeroes at offset 0x6 waiting for the real value:

$ objdump --disassemble ./test.o

test.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <function>:
   0:	55                   	push   %rbp
   1:	48 89 e5             	mov    %rsp,%rbp
   4:	8b 05 00 00 00 00    	mov    0x0(%rip),%eax        # a <function+0xa>
   a:	5d                   	pop    %rbp
   b:	c3                   	retq 

This will be resolved at link time.
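
What the linker does with this entry can be sketched as follows. The x86-64 psABI defines R_X86_64_PC32 as S + A - P, where S is the symbol address, A the addend and P the address of the patched field. The function name apply_pc32 and the addresses in the usage note are assumptions for illustration only; a real linker does considerably more bookkeeping.

```c
#include <stdint.h>

/* Minimal sketch of applying one R_X86_64_PC32 relocation.
 * section      : bytes of the section being patched
 * section_addr : address the section will have in the final image (assumed)
 * r_offset     : offset of the 4-byte field inside the section
 * sym_addr     : final address of the referenced symbol (S)
 * r_addend     : addend from the Rela entry (A)
 * Returns 0 on success, -1 if the result does not fit in 32 signed bits. */
static int apply_pc32(uint8_t *section, uint64_t section_addr,
                      uint64_t r_offset, uint64_t sym_addr, int64_t r_addend)
{
    uint64_t place = section_addr + r_offset;                 /* P */
    int64_t value = (int64_t)(sym_addr + (uint64_t)r_addend - place);

    if (value < INT32_MIN || value > INT32_MAX)
        return -1;                                            /* overflow */

    /* Store the field byte by byte in little-endian order, as x86-64 expects. */
    uint32_t bits = (uint32_t)(int32_t)value;
    for (int i = 0; i < 4; i++)
        section[r_offset + i] = (uint8_t)(bits >> (8 * i));
    return 0;
}
```

With .text assumed at 0x401000 and foo at 0x404010, the field at offset 0x6 becomes 0x3006. The addend of -4 is there because the CPU computes the address relative to the end of the instruction (0x40100a), which is 4 bytes past the start of the field.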

For unresolved data or code in a dynamic library, or in code that uses dynamic libraries, how can the program access data or call a function whose address is still unknown at compile time and link time?

The GOT and PLT provide a layer of indirection that the dynamic loader fills in with the right locations:

  • GOT (Global Offset Table). The table holds placeholders for unresolved addresses; the dynamic loader fills in the actual addresses when it loads the library.
  • PLT (Procedure Linkage Table). Calls to unresolved functions are routed through a PLT stub. With lazy binding, the stub behaves as follows:
    • The resolver path in the PLT stub is taken only on the first call, while the target function is still unresolved and its .got.plt entry still points back into the PLT stub;
      • The PLT stub calls into ld.so to resolve the address;
      • Once ld.so has resolved the function address, it also saves that address in the GOT (.got.plt), replacing the PLT stub address.
    • On later calls, control jumps through the .got.plt entry straight to the function, without going through the resolver path again.
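
The lazy-binding steps above can be imitated in plain C with a function pointer standing in for the .got.plt slot. This is only a simulation under stated assumptions: real_target, lazy_stub, plt_entry, got_plt_slot and resolve_count are hypothetical names, and a real PLT is a few machine instructions that call into ld.so, not a C function.

```c
#include <stdint.h>

typedef int (*func_t)(int);

/* Stands in for a function that lives in a shared library. */
static int real_target(int x) { return x + 1; }

/* Counts how often the (simulated) resolver runs. */
static int resolve_count = 0;

static int lazy_stub(int x);

/* Simulated .got.plt slot: before resolution it points back at the stub. */
static func_t got_plt_slot = lazy_stub;

/* Simulated resolver path: look up the target, patch the slot, then
 * forward the call so the first call still reaches the function. */
static int lazy_stub(int x)
{
    resolve_count++;
    got_plt_slot = real_target;
    return got_plt_slot(x);
}

/* Simulated PLT entry: always an indirect jump through the slot. */
static int plt_entry(int x) { return got_plt_slot(x); }
```

Calling plt_entry twice runs the resolver once: the first call goes through lazy_stub and patches the slot, and the second call reaches real_target directly through the patched slot.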

Relocation support in LLD

See more on Relocation in LLD.

// lld/ELF/Relocations.cpp

//===- Relocations.cpp ----------------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file contains platform-independent functions to process relocations.
// I'll describe the overview of this file here.
//
// Simple relocations are easy to handle for the linker. For example,
// for R_X86_64_PC64 relocs, the linker just has to fix up locations
// with the relative offsets to the target symbols. It would just be
// reading records from relocation sections and applying them to output.
//
// But not all relocations are that easy to handle. For example, for
// R_386_GOTOFF relocs, the linker has to create new GOT entries for
// symbols if they don't exist, and fix up locations with GOT entry
// offsets from the beginning of GOT section. So there is more than
// fixing addresses in relocation processing.
//
// ELF defines a large number of complex relocations.
//
// The functions in this file analyze relocations and do whatever needs
// to be done. It includes, but not limited to, the following.
//
//  - create GOT/PLT entries
//  - create new relocations in .dynsym to let the dynamic linker resolve
//    them at runtime (since ELF supports dynamic linking, not all
//    relocations can be resolved at link-time)
//  - create COPY relocs and reserve space in .bss
//  - replace expensive relocs (in terms of runtime cost) with cheap ones
//  - error out infeasible combinations such as PIC and non-relative relocs
//
// Note that the functions in this file don't actually apply relocations
// because it doesn't know about the output file nor the output file buffer.
// It instead stores Relocation objects to InputSection's Relocations
// vector to let it apply later in InputSection::writeTo.
//
//===----------------------------------------------------------------------===//

COPY relocations

reference: Copy Relocations

  • RELRO
    • References: Hardening ELF binaries using Relocation Read-Only (RELRO)
    • ELF: Executable and Linkable Format.
    • PIE: Position Independent Executable.
    • RELRO: Relocation Read-Only.
    • GOT: Global Offset Table, used in dynamically linked ELF binaries.
      • A look-up table of pointers to the actual locations of dynamically resolved functions.
      • Lives in the .got.plt section, located at a static address.
      • Needs to be writable, so attackers can overwrite it, for example through a buffer overflow.
      • Populated dynamically as the program runs: on the first call the GOT entry points back to the PLT (and from there into a dynamic linker procedure); the dynamic linker finds the actual location and writes it to the GOT.

Created Jul 28, 2020 // Last Updated Aug 5, 2020