References:
Relocation is the process of connecting symbolic references with symbolic definitions. For example, when a program calls a function, the associated call instruction must transfer control to the proper destination address at execution. In other words, relocatable files must have information that describes how to modify their section contents, thus allowing executable and shared object files to hold the right information for a process’s program image. Relocation entries are the data designed for this task.
typedef struct {
Elf32_Addr r_offset;
Elf32_Word r_info;
} Elf32_Rel;
typedef struct {
Elf32_Addr r_offset;
Elf32_Word r_info;
Elf32_Sword r_addend;
} Elf32_Rela;
r_offset
: gives the location at which to apply the relocation action. For a relocatable file, the value is the byte offset from the beginning of the section to the storage unit affected by the relocation. For an executable file or a shared object, the value is the virtual address of the storage unit affected by the relocation. The storage unit is the field being patched, for example the immediate bytes of an instruction or a pointer-sized data word. Relocation entries for different object files have slightly different interpretations of r_offset:
- In relocatable files, r_offset holds a section offset. That is, the relocation section itself describes how to modify another section in the same file; relocation offsets designate a storage unit within the second section.
- In executable and shared object files, r_offset holds a virtual address. To make these files' relocation entries more useful for the dynamic linker, the section offset (file interpretation) gives way to a virtual address (memory interpretation).
- Although the interpretation of r_offset differs between object file types, the goal is solely to make relocation access more efficient, and the underlying meaning of r_offset stays the same.
r_info
: gives both the symbol table index with respect to which the relocation must be made, and the type of the relocation to apply.
- If the index is STN_UNDEF, the undefined symbol index, the relocation uses 0 as the symbol value.
- A relocation entry's relocation type or symbol table index is the result of applying ELF32_R_TYPE or ELF32_R_SYM, respectively, to the entry's r_info member.
r_addend
: a constant addend used to compute the value to be stored into the relocatable field.
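To make the r_info packing concrete, here is a minimal sketch of the unpacking arithmetic. It mirrors the bit layout of the standard ELF32_R_SYM / ELF32_R_TYPE / ELF32_R_INFO macros and the generic ELF64 split; the function names (rel32_sym, rel64_sym, and so on) are made up for this sketch so that it compiles without elf.h.

```c
#include <stdint.h>

/* ELF32: symbol index in the upper 24 bits of r_info, type in the low 8 bits.
   These helper names are local to this sketch, not part of any header. */
uint32_t rel32_sym(uint32_t info)  { return info >> 8; }
uint32_t rel32_type(uint32_t info) { return info & 0xffu; }
uint32_t rel32_info(uint32_t sym, uint32_t type)
{
    return (sym << 8) + (type & 0xffu);
}

/* Generic ELF64: symbol index in the upper 32 bits, type in the lower 32. */
uint32_t rel64_sym(uint64_t info)  { return (uint32_t)(info >> 32); }
uint32_t rel64_type(uint64_t info) { return (uint32_t)(info & 0xffffffffu); }
```

For an ELF64 Info value such as 000900000002, rel64_sym yields 9 and rel64_type yields 2, which is the numeric value of R_X86_64_PC32.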
A relocation section references two other sections: a symbol table and a section to modify. The section header's sh_link and sh_info members, respectively, specify these relationships.
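As a rough illustration of that lookup, the sketch below uses a cut-down section header (MiniShdr, a hypothetical type with only the fields needed here) and returns the symbol table index from sh_link and the index of the section being modified from sh_info. The type values 4 and 9 are the standard SHT_RELA and SHT_REL constants; everything else is an assumption for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Only the fields this sketch needs; a real Elf32_Shdr has more members. */
typedef struct {
    uint32_t sh_type;
    uint32_t sh_link;  /* relocation section: index of its symbol table     */
    uint32_t sh_info;  /* relocation section: index of the section to patch */
} MiniShdr;

enum { MINI_SHT_RELA = 4, MINI_SHT_REL = 9 };

/* Returns 1 and fills both indices if shdrs[idx] is a relocation section
   with in-range links; returns 0 otherwise. */
int reloc_section_targets(const MiniShdr *shdrs, size_t count, size_t idx,
                          uint32_t *symtab_idx, uint32_t *target_idx)
{
    if (idx >= count)
        return 0;
    const MiniShdr *rel = &shdrs[idx];
    if (rel->sh_type != MINI_SHT_REL && rel->sh_type != MINI_SHT_RELA)
        return 0;
    if (rel->sh_link >= count || rel->sh_info >= count)
        return 0;
    *symtab_idx = rel->sh_link;
    *target_idx = rel->sh_info;
    return 1;
}
```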
$ readelf -r global_array.exe
There are no relocations in this file.
$ readelf -r global_array.o
Relocation section '.rela.text' at offset 0xcd8 contains 28 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
0000000000000010 0000000e00051807 R_MIPS_GPREL16/R_MIPS_SUB/R_MIPS_HI16 0000000000000000 set_global_int + 0
0000000000000018 0000000e00061807 R_MIPS_GPREL16/R_MIPS_SUB/R_MIPS_LO16 0000000000000000 set_global_int + 0
0000000000000058 0000000a00000013 R_MIPS_GOT_DISP/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 global_int + 0
00000000000000b4 0000000f00051807 R_MIPS_GPREL16/R_MIPS_SUB/R_MIPS_HI16 00000000000000a0 use_global_int + 0
00000000000000bc 0000000f00061807 R_MIPS_GPREL16/R_MIPS_SUB/R_MIPS_LO16 00000000000000a0 use_global_int + 0
00000000000000c4 0000000600000014 R_MIPS_GOT_PAGE/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .rodata.str1.1 + 0
00000000000000c8 0000000600000015 R_MIPS_GOT_OFST/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .rodata.str1.1 + 0
00000000000000cc 0000000d0000000b R_MIPS_CALL16/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 printf + 0
0000000000000110 0000000a00000013 R_MIPS_GOT_DISP/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 global_int + 0
0000000000000144 0000000600000014 R_MIPS_GOT_PAGE/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .rodata.str1.1 + 2
0000000000000148 0000000600000015 R_MIPS_GOT_OFST/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .rodata.str1.1 + 2
000000000000014c 0000000d0000000b R_MIPS_CALL16/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 printf + 0
00000000000000d8 0000000d00000025 R_MIPS_JALR/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 printf + 0
000000000000016c 0000000600000014 R_MIPS_GOT_PAGE/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .rodata.str1.1 + 7
0000000000000170 0000000600000015 R_MIPS_GOT_OFST/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .rodata.str1.1 + 7
0000000000000174 0000000d0000000b R_MIPS_CALL16/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 printf + 0
0000000000000154 0000000d00000025 R_MIPS_JALR/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 printf + 0
00000000000001ac 0000000600000014 R_MIPS_GOT_PAGE/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .rodata.str1.1 + e
00000000000001b0 0000000600000015 R_MIPS_GOT_OFST/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .rodata.str1.1 + e
00000000000001b4 0000000d0000000b R_MIPS_CALL16/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 printf + 0
000000000000017c 0000000d00000025 R_MIPS_JALR/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 printf + 0
00000000000001bc 0000000d00000025 R_MIPS_JALR/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 printf + 0
00000000000001f4 0000000c00051807 R_MIPS_GPREL16/R_MIPS_SUB/R_MIPS_HI16 00000000000001e0 main + 0
00000000000001fc 0000000c00061807 R_MIPS_GPREL16/R_MIPS_SUB/R_MIPS_LO16 00000000000001e0 main + 0
0000000000000200 0000000f0000000b R_MIPS_CALL16/R_MIPS_NONE/R_MIPS_NONE 00000000000000a0 use_global_int + 0
0000000000000224 0000000f0000000b R_MIPS_CALL16/R_MIPS_NONE/R_MIPS_NONE 00000000000000a0 use_global_int + 0
0000000000000218 0000000f00000025 R_MIPS_JALR/R_MIPS_NONE/R_MIPS_NONE 00000000000000a0 use_global_int + 0
0000000000000230 0000000f00000025 R_MIPS_JALR/R_MIPS_NONE/R_MIPS_NONE 00000000000000a0 use_global_int + 0
...
Relocation section '.rela.stack_sizes' at offset 0xfc0 contains 3 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
0000000000000000 0000000500000012 R_MIPS_64/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .text + 0
0000000000000009 0000000500000012 R_MIPS_64/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .text + a0
0000000000000012 0000000500000012 R_MIPS_64/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .text + 1e0
Relocation section '.rela.debug_info' at offset 0x1008 contains 29 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
0000000000000006 0000000700000002 R_MIPS_32/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .debug_abbrev + 0
000000000000000c 0000000800000002 R_MIPS_32/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .debug_str + 0
0000000000000012 0000000800000002 R_MIPS_32/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .debug_str + 67
0000000000000016 0000000900000002 R_MIPS_32/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .debug_line + 0
000000000000001a 0000000800000002 R_MIPS_32/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .debug_str + 76
000000000000001e 0000000500000012 R_MIPS_64/R_MIPS_NONE/R_MIPS_NONE 0000000000000000 .text + 0
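The Info column in this MIPS64 listing does not follow the generic ELF64 split into a 32-bit symbol index and a 32-bit type. It appears to follow the MIPS N64 layout, where the low 32 bits hold a special-symbol byte and three 8-bit relocation types that are applied in sequence. The sketch below decodes the value as readelf prints it; the struct and function names are hypothetical, and the field order is an assumption that happens to agree with the type triples shown in the Type column.

```c
#include <stdint.h>

/* Decoded view of a MIPS N64 r_info value as displayed by readelf. */
typedef struct {
    uint32_t sym;    /* symbol table index                       */
    uint8_t  ssym;   /* special symbol                           */
    uint8_t  type3;  /* third relocation type (printed last)     */
    uint8_t  type2;  /* second relocation type (printed second)  */
    uint8_t  type;   /* first relocation type (printed first)    */
} Mips64Info;

Mips64Info mips64_decode_info(uint64_t info)
{
    Mips64Info d;
    d.sym   = (uint32_t)(info >> 32);
    d.ssym  = (uint8_t)(info >> 24);
    d.type3 = (uint8_t)(info >> 16);
    d.type2 = (uint8_t)(info >> 8);
    d.type  = (uint8_t)info;
    return d;
}
```

Decoding 0000000e00051807 gives symbol index 14 and types 7, 24, 5, which are the numeric values of R_MIPS_GPREL16, R_MIPS_SUB and R_MIPS_HI16, matching the first row of the .rela.text listing.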
References:
PLT (.plt) and GOT (.got) are for dynamic linking.
Relocation entry: A relocation entry in a binary is a descriptor which essentially says “determine the value of X, and put that value into the binary at offset Y” – each relocation has a specific type, defined in the ABI documentation, which describes exactly how to “determine the value of X”.
These relocation entries in binaries are left to be filled in later – at link time by the toolchain linker or at runtime by the dynamic linker.
$ cat test.c
extern int foo;
int function(void) {
    return foo;
}
$ gcc -c test.c -o test.o
$ readelf --relocs ./test.o
Relocation section '.rela.text' at offset 0x1b8 contains 1 entry:
Offset Info Type Sym. Value Sym. Name + Addend
000000000006 000900000002 R_X86_64_PC32 0000000000000000 foo - 4
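The value of foo is not known during compilation, so the compiler leaves behind a relocation entry (of type R_X86_64_PC32) which says, in effect, “in the final binary, patch the 4 bytes at offset 0x6 of this section with the PC-relative offset to symbol foo”. For this relocation type the stored value is S + A - P: the symbol address, plus the addend (-4 here), minus the address of the patched location.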
If you take a look at the output file, you can see at offset 0x6 there are 4 bytes of zeroes just waiting for a real address:
$ objdump --disassemble ./test.o
test.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <function>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 8b 05 00 00 00 00 mov 0x0(%rip),%eax # a <function+0xa>
a: 5d pop %rbp
b: c3 retq
This will be resolved at link time.
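To show what "resolved at link time" amounts to for this entry, here is a minimal sketch of applying an R_X86_64_PC32 relocation, assuming the S + A - P formula and a little-endian 32-bit field. The function name apply_pc32 and all addresses used with it are hypothetical; a real linker does considerably more bookkeeping.

```c
#include <stdint.h>

/* Computes S + A - P and stores it as a little-endian 32-bit value at
   section[r_offset]. Returns 0 on success, -1 if the result does not fit
   in a signed 32-bit field. */
int apply_pc32(uint8_t *section, uint64_t section_addr, uint64_t r_offset,
               uint64_t sym_addr, int64_t addend)
{
    uint64_t place = section_addr + r_offset;                        /* P */
    int64_t value = (int64_t)(sym_addr + (uint64_t)addend - place);  /* S + A - P */
    if (value < INT32_MIN || value > INT32_MAX)
        return -1;
    uint32_t v = (uint32_t)(int32_t)value;
    for (int i = 0; i < 4; i++)
        section[r_offset + i] = (uint8_t)(v >> (8 * i));
    return 0;
}
```

With .text placed at a made-up address 0x401000 and foo at 0x404028, the field at offset 0x6 becomes 0x404028 - 4 - 0x401006 = 0x301e. The -4 addend is there because %rip-relative addressing is measured from the end of the instruction (offset 0xa in the disassembly), which is 4 bytes past the start of the patched field.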
For unresolved data or code in a dynamic library, or in code that uses dynamic libraries, how does a program access data or call a function whose address is not yet known at compile or link time?
The GOT and PLT provide a layer of indirection that the dynamic loader can fill in with the right location. With lazy binding, a call through the PLT works roughly as follows:
- Initially, the .got.plt entry stores the address of the PLT stub;
- On the first call, the PLT stub calls ld.so functions to resolve the address;
- Once the address is resolved by ld.so, the loader also saves it to the GOT (.got.plt), which replaces the PLT stub address;
- Later calls jump to the address stored in the GOT (.got.plt) without going through the PLT stub's resolver path.
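The lazy-binding steps above can be imitated in plain C with a function pointer standing in for a .got.plt slot. This is only a toy model under stated assumptions: got_slot, lazy_stub, call_add_one and real_add_one are invented names, and the "resolution" is a hard-coded assignment rather than a symbol lookup by ld.so.

```c
typedef int (*fn_t)(int);

int resolve_count = 0;                      /* how many times the slow path ran */

int real_add_one(int x) { return x + 1; }   /* stands in for the library function */

int lazy_stub(int x);                       /* slow path, defined below */

/* One-entry "GOT": initially it points back at the stub's resolver path. */
fn_t got_slot = lazy_stub;

/* Slow path: find the real target, patch the GOT slot, then forward the call. */
int lazy_stub(int x)
{
    resolve_count++;
    got_slot = real_add_one;                /* the step ld.so performs after lookup */
    return got_slot(x);
}

/* "PLT entry": every call is an indirect jump through the GOT slot. */
int call_add_one(int x) { return got_slot(x); }
```

The first call_add_one goes through lazy_stub and patches got_slot; every later call lands directly in real_add_one, so resolve_count stays at 1.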
See more on relocation in LLD:
// lld/ELF/Relocations.cpp
//===- Relocations.cpp ----------------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file contains platform-independent functions to process relocations.
// I'll describe the overview of this file here.
//
// Simple relocations are easy to handle for the linker. For example,
// for R_X86_64_PC64 relocs, the linker just has to fix up locations
// with the relative offsets to the target symbols. It would just be
// reading records from relocation sections and applying them to output.
//
// But not all relocations are that easy to handle. For example, for
// R_386_GOTOFF relocs, the linker has to create new GOT entries for
// symbols if they don't exist, and fix up locations with GOT entry
// offsets from the beginning of GOT section. So there is more than
// fixing addresses in relocation processing.
//
// ELF defines a large number of complex relocations.
//
// The functions in this file analyze relocations and do whatever needs
// to be done. It includes, but not limited to, the following.
//
// - create GOT/PLT entries
// - create new relocations in .dynsym to let the dynamic linker resolve
// them at runtime (since ELF supports dynamic linking, not all
// relocations can be resolved at link-time)
// - create COPY relocs and reserve space in .bss
// - replace expensive relocs (in terms of runtime cost) with cheap ones
// - error out infeasible combinations such as PIC and non-relative relocs
//
// Note that the functions in this file don't actually apply relocations
// because it doesn't know about the output file nor the output file buffer.
// It instead stores Relocation objects to InputSection's Relocations
// vector to let it apply later in InputSection::writeTo.
//
//===----------------------------------------------------------------------===//
Reference: Copy Relocations
References: Hardening ELF binaries using Relocation Read-Only (RELRO)
RELRO
- ELF: Executable and Linkable Format.
- PIE: Position Independent Executables.
- RELRO: Relocation Read-Only.
In a dynamically linked ELF:
- GOT: Global Offset Table. A lookup table containing pointers to the actual locations of dynamically resolved functions.
- It lives in the .got.plt section, is located at a static address, and needs to be writable, so it can be overwritten by attackers.
- It is populated dynamically as the program runs: the first time, the GOT entry points back to the PLT (into a dynamic linker procedure); the dynamic linker finds the actual location, which is then written to the GOT.
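The idea behind RELRO, as commonly described, is to make relocated tables read-only once the dynamic linker has finished writing them, so a later memory-corruption bug cannot redirect their pointers. The sketch below imitates that on a POSIX system with mmap and mprotect: it fills a page-sized table and then drops write permission. The function name make_readonly_table is hypothetical, and this is an analogy for the mechanism, not what ld.so literally executes.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Copies n pointers into a freshly mapped page, then marks the page
   read-only. Returns the table, or NULL on failure. */
void **make_readonly_table(void *const *entries, size_t n)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || n * sizeof(void *) > (size_t)page)
        return NULL;

    void **table = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (table == MAP_FAILED)
        return NULL;

    for (size_t i = 0; i < n; i++)      /* the "relocation" phase: table is writable */
        table[i] = entries[i];

    if (mprotect(table, (size_t)page, PROT_READ) != 0) {  /* the "RELRO" step */
        munmap(table, (size_t)page);
        return NULL;
    }
    return table;                       /* writes to table[i] now fault */
}
```

Reads through the returned table still work, while any later store into it raises a fault instead of silently redirecting a pointer.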
If you could revise the fundamental principles of computer system design to improve security... what would you change?