References:
CFI (Call Frame Information) directives: .cfi_personality directives, etc.
.eh_frame section. Its format is similar to the .debug_frame section specified by the DWARF standard.
.eh_frame and .eh_frame_hdr describe the call frames that must be unwound during exception handling.
DWARF Debugging Information Format: the .debug_frame/info/loc/ sections.
ELF Data Types
(Elf32_Half, Elf32_Off, Elf32_Addr, Elf32_Word, Elf32_Sword)
An ELF file has many headers, but only one has a fixed placement: the ELF header, present at the beginning of every file.
The ELF header provides information about the file, such as the machine type, architecture and byte order, as well as a means of identifying the file and checking whether it is valid. It also provides the locations of the section header table and the program header table.
// ?
# define ELF_NIDENT 16
typedef struct {
uint8_t e_ident[ELF_NIDENT];
Elf32_Half e_type;
Elf32_Half e_machine;
Elf32_Word e_version;
Elf32_Addr e_entry;
Elf32_Off e_phoff;
Elf32_Off e_shoff;
Elf32_Word e_flags;
Elf32_Half e_ehsize;
Elf32_Half e_phentsize;
Elf32_Half e_phnum;
Elf32_Half e_shentsize;
Elf32_Half e_shnum;
Elf32_Half e_shstrndx;
} Elf32_Ehdr;
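As a minimal sketch of the "identifying and checking" step, the function below inspects only the e_ident bytes. The index names (EI_MAG0, EI_CLASS, EI_DATA) and their values come from the ELF specification rather than from the text above; the helper name elf32_ident_ok is hypothetical, and the check assumes a 32-bit little-endian file is wanted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Indices into e_ident and the values expected there (per the ELF spec). */
enum {
    EI_MAG0 = 0, EI_MAG1 = 1, EI_MAG2 = 2, EI_MAG3 = 3,
    EI_CLASS = 4, /* 1 = 32-bit objects, 2 = 64-bit objects */
    EI_DATA = 5   /* 1 = little-endian, 2 = big-endian */
};
enum { ELFCLASS32 = 1, ELFDATA2LSB = 1 };

/* Hypothetical helper: returns true if buf starts with the identification
 * bytes of a 32-bit little-endian ELF file. Only e_ident is examined. */
static bool elf32_ident_ok(const uint8_t *buf, size_t len) {
    if (buf == NULL || len < 16) {
        return false; /* e_ident is ELF_NIDENT (16) bytes long */
    }
    if (buf[EI_MAG0] != 0x7f || buf[EI_MAG1] != 'E' ||
        buf[EI_MAG2] != 'L' || buf[EI_MAG3] != 'F') {
        return false; /* magic number mismatch */
    }
    return buf[EI_CLASS] == ELFCLASS32 && buf[EI_DATA] == ELFDATA2LSB;
}
```

A real loader would go on to check e_type, e_machine and e_version before trusting any offsets in the header.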
An ELF file contains many different types of sections and their relevant headers. Not all of them are present in every file, and there is no guarantee about the order in which they appear. Thus, in order to parse and process these sections, the ELF format also defines section headers, which contain information such as section names, sizes, locations and other relevant information. The list of all the section headers in an ELF image is referred to as the section header table.
// https://github.com/freebsd/freebsd/blob/master/sys/sys/elf32.h
/*
* Section header.
*/
typedef struct {
Elf32_Word sh_name; /* Section name (index into the
section header string table). */
Elf32_Word sh_type; /* Section type. */
Elf32_Word sh_flags; /* Section flags. */
Elf32_Addr sh_addr; /* Address in memory image. */
Elf32_Off sh_offset; /* Offset in file. */
Elf32_Word sh_size; /* Size in bytes. */
Elf32_Word sh_link; /* Index of a related section. */
Elf32_Word sh_info; /* Depends on section type. */
Elf32_Word sh_addralign; /* Alignment in bytes. */
Elf32_Word sh_entsize; /* Size of each entry in section. */
} Elf32_Shdr;
sh_name does not point directly to a string. Instead, it holds the offset of the name within the section name string table. The section header index of that string table is given by the e_shstrndx field in the ELF header.
sh_addr is described below, together with sh_link and sh_info.
sh_offset is the position of the section in the ELF image file, as an offset from the beginning of the file.
sh_type stores the type of the section. The value is one of the ShT_Types enumerators (see below).
//
enum ShT_Types {
SHT_NULL = 0, // Null section
SHT_PROGBITS = 1, // Program information
SHT_SYMTAB = 2, // Symbol table
SHT_STRTAB = 3, // String table
SHT_RELA = 4, // Relocation (w/ addend)
SHT_HASH = 5,
SHT_DYNAMIC = 6,
SHT_NOTE = 7,
SHT_NOBITS = 8, // Not present in file
SHT_REL = 9, // Relocation (no addend)
SHT_SHLIB = 10,
SHT_DYNSYM = 11,
SHT_LOPROC = 0x70000000, // Start of the range reserved for processor-specific semantics
SHT_HIPROC = 0x7fffffff, // End of that range (inclusive)
SHT_LOUSER = 0x80000000, // Lower bound of the range of types reserved for application programs
SHT_HIUSER = 0xffffffff, // Upper bound; section types between SHT_LOUSER and
                         // SHT_HIUSER may be used by the application without conflicting
                         // with current or future system-defined section types.
};
A full list of section types in LLVM:
// https://github.com/llvm/llvm-project/blob/13b63be472233762024ef196dd88612369a51807/llvm/include/llvm/BinaryFormat/ELF.h#L819
// Section types.
enum : unsigned {
SHT_NULL = 0, // No associated section (inactive entry).
SHT_PROGBITS = 1, // Program-defined contents.
SHT_SYMTAB = 2, // Symbol table.
SHT_STRTAB = 3, // String table.
SHT_RELA = 4, // Relocation entries; explicit addends.
SHT_HASH = 5, // Symbol hash table.
SHT_DYNAMIC = 6, // Information for dynamic linking.
SHT_NOTE = 7, // Information about the file.
SHT_NOBITS = 8, // Data occupies no space in the file.
SHT_REL = 9, // Relocation entries; no explicit addends.
SHT_SHLIB = 10, // Reserved.
SHT_DYNSYM = 11, // Symbol table.
SHT_INIT_ARRAY = 14, // Pointers to initialization functions.
SHT_FINI_ARRAY = 15, // Pointers to termination functions.
SHT_PREINIT_ARRAY = 16, // Pointers to pre-init functions.
SHT_GROUP = 17, // Section group.
SHT_SYMTAB_SHNDX = 18, // Indices for SHN_XINDEX entries.
// Experimental support for SHT_RELR sections. For details, see proposal
// at https://groups.google.com/forum/#!topic/generic-abi/bX460iggiKg
SHT_RELR = 19, // Relocation entries; only offsets.
SHT_LOOS = 0x60000000, // Lowest operating system-specific type.
// Android packed relocation section types.
// https://android.googlesource.com/platform/bionic/+/6f12bfece5dcc01325e0abba56a46b1bcf991c69/tools/relocation_packer/src/elf_file.cc#37
SHT_ANDROID_REL = 0x60000001,
SHT_ANDROID_RELA = 0x60000002,
SHT_LLVM_ODRTAB = 0x6fff4c00, // LLVM ODR table.
SHT_LLVM_LINKER_OPTIONS = 0x6fff4c01, // LLVM Linker Options.
SHT_LLVM_CALL_GRAPH_PROFILE = 0x6fff4c02, // LLVM Call Graph Profile.
SHT_LLVM_ADDRSIG = 0x6fff4c03, // List of address-significant symbols
// for safe ICF.
SHT_LLVM_DEPENDENT_LIBRARIES =
0x6fff4c04, // LLVM Dependent Library Specifiers.
SHT_LLVM_SYMPART = 0x6fff4c05, // Symbol partition specification.
SHT_LLVM_PART_EHDR = 0x6fff4c06, // ELF header for loadable partition.
SHT_LLVM_PART_PHDR = 0x6fff4c07, // Phdrs for loadable partition.
// Capsule Ownership Section Type
SHT_CAPSULE_OWNERSHIP = 0x6fff4d00, // capsule ownership section
// Android's experimental support for SHT_RELR sections.
// https://android.googlesource.com/platform/bionic/+/b7feec74547f84559a1467aca02708ff61346d2a/libc/include/elf.h#512
SHT_ANDROID_RELR = 0x6fffff00, // Relocation entries; only offsets.
SHT_GNU_ATTRIBUTES = 0x6ffffff5, // Object attributes.
SHT_GNU_HASH = 0x6ffffff6, // GNU-style hash table.
SHT_GNU_verdef = 0x6ffffffd, // GNU version definitions.
SHT_GNU_verneed = 0x6ffffffe, // GNU version references.
SHT_GNU_versym = 0x6fffffff, // GNU symbol versions table.
SHT_HIOS = 0x6fffffff, // Highest operating system-specific type.
SHT_LOPROC = 0x70000000, // Lowest processor arch-specific type.
// Fixme: All this is duplicated in MCSectionELF. Why??
// Exception Index table
SHT_ARM_EXIDX = 0x70000001U,
// BPABI DLL dynamic linking pre-emption map
SHT_ARM_PREEMPTMAP = 0x70000002U,
// Object file compatibility attributes
SHT_ARM_ATTRIBUTES = 0x70000003U,
SHT_ARM_DEBUGOVERLAY = 0x70000004U,
SHT_ARM_OVERLAYSECTION = 0x70000005U,
SHT_HEX_ORDERED = 0x70000000, // Link editor is to sort the entries in
// this section based on their sizes
SHT_X86_64_UNWIND = 0x70000001, // Unwind information
SHT_MIPS_REGINFO = 0x70000006, // Register usage information
SHT_MIPS_OPTIONS = 0x7000000d, // General options
SHT_MIPS_DWARF = 0x7000001e, // DWARF debugging section.
SHT_MIPS_ABIFLAGS = 0x7000002a, // ABI information.
SHT_MSP430_ATTRIBUTES = 0x70000003U,
SHT_RISCV_ATTRIBUTES = 0x70000003U,
SHT_HIPROC = 0x7fffffff, // Highest processor arch-specific type.
SHT_LOUSER = 0x80000000, // Lowest type reserved for applications.
SHT_HIUSER = 0xffffffff // Highest type reserved for applications.
};
sh_addr
: If the section will appear in the memory image of a process, this member
gives the address at which the section’s first byte should reside. Otherwise,
the member contains 0.
sh_link
: This member holds a section header table index link, whose interpretation
depends on the section type.
sh_info
: This member holds extra information, whose interpretation depends on the
section type.
Note: SHT_PROGBITS sections do not use sh_link and sh_info by default (sh_info is 0 and sh_link is SHN_UNDEF).
Another table from the Oracle book:
Table 3-3 Section Types
Name | Value | Description | Interpretation of sh_info | Interpretation of sh_link |
---|---|---|---|---|
SHT_NULL | 0 | Marks section header as inactive; file has no corresponding section. | 0 | SHN_UNDEF |
SHT_PROGBITS | 1 | Contains information defined by the program, and in a format and with a meaning determined solely by the program. | 0 | SHN_UNDEF |
SHT_SYMTAB | 2 | Is a complete symbol table, usually for link editing. This table can also be used for dynamic linking; however, it can contain many unnecessary symbols. Note: Only one section of this type is allowed in a file. | One greater than the symbol table index of the last local symbol. | The section header index of the associated string table. |
SHT_STRTAB | 3 | Is a string table. A file can have multiple string table sections. | 0 | SHN_UNDEF |
SHT_RELA | 4 | Contains relocation entries with explicit addends. A file can have multiple relocation sections. | The section header index of the section to which the relocation applies. | The section header index of the associated symbol table. |
SHT_HASH | 5 | Is a symbol hash table. Note: Only one section of this type is allowed in a file. | 0 | The section header index of the symbol table to which the hash table applies. |
SHT_DYNAMIC | 6 | Contains dynamic linking information. Note: Only one section of this type is allowed in a file. | 0 | The section header index of the string table used by entries in the section. |
SHT_NOTE | 7 | Contains information that marks the file. | 0 | SHN_UNDEF |
SHT_NOBITS | 8 | Contains information defined by the program, and in a format and with a meaning determined by the program. However, a section of this type occupies no space in the file, but the section header's offset field specifies the location at which the section would have begun if it did occupy space within the file. | 0 | SHN_UNDEF |
SHT_REL | 9 | Contains relocation entries without explicit addends. A file can have multiple relocation sections. | The section header index of the section to which the relocation applies. | The section header index of the associated symbol table. |
SHT_SHLIB | 10 | Reserved. | 0 | SHN_UNDEF |
SHT_DYNSYM | 11 | Is a symbol table with a minimal set of symbols for dynamic linking. Note: Only one section of this type is allowed in a file. | One greater than the symbol table index of the last local symbol. | The section header index of the associated string table. |
SHT_LOPROC to SHT_HIPROC | 0x70000000 to 0x7fffffff | Lower and upper bounds of the range of section types reserved for processor-specific semantics. | 0 | SHN_UNDEF |
SHT_LOUSER to SHT_HIUSER | 0x80000000 to 0xffffffff | Lower and upper bounds of the range of section types reserved for application programs. Note: Section types in this range can be used by an application without conflicting with system-defined section types. | 0 | SHN_UNDEF |
sh_entsize
: Some sections hold a table of fixed-size entries, such as a symbol table. For
such a section, this member gives the size in bytes of each entry. The
member contains 0 if the section does not hold a table of fixed-size entries.
sh_flags stores bit flags that describe the section attributes.
A list of the section flags/attributes that can be set in sh_flags, as defined in LLVM:
// https://github.com/llvm/llvm-project/blob/13b63be472233762024ef196dd88612369a51807/llvm/include/llvm/BinaryFormat/ELF.h#L892
// Section flags.
enum : unsigned {
// Section data should be writable during execution.
SHF_WRITE = 0x1,
// Section occupies memory during program execution.
SHF_ALLOC = 0x2,
// Section contains executable machine instructions.
SHF_EXECINSTR = 0x4,
// The data in this section may be merged.
SHF_MERGE = 0x10,
// The data in this section is null-terminated strings.
SHF_STRINGS = 0x20,
// A field in this section holds a section header table index.
SHF_INFO_LINK = 0x40U,
// Adds special ordering requirements for link editors.
SHF_LINK_ORDER = 0x80U,
// This section requires special OS-specific processing to avoid incorrect
// behavior.
SHF_OS_NONCONFORMING = 0x100U,
// This section is a member of a section group.
SHF_GROUP = 0x200U,
// This section holds Thread-Local Storage.
SHF_TLS = 0x400U,
// Identifies a section containing compressed data.
SHF_COMPRESSED = 0x800U,
// This section is excluded from the final executable or shared library.
SHF_EXCLUDE = 0x80000000U,
// Start of target-specific flags.
SHF_MASKOS = 0x0ff00000,
// Bits indicating processor-specific flags.
SHF_MASKPROC = 0xf0000000,
/// All sections with the "d" flag are grouped together by the linker to form
/// the data section and the dp register is set to the start of the section by
/// the boot code.
XCORE_SHF_DP_SECTION = 0x10000000,
/// All sections with the "c" flag are grouped together by the linker to form
/// the constant pool and the cp register is set to the start of the constant
/// pool by the boot code.
XCORE_SHF_CP_SECTION = 0x20000000,
// If an object file section does not have this flag set, then it may not hold
// more than 2GB and can be freely referred to in objects using smaller code
// models. Otherwise, only objects using larger code models can refer to them.
// For example, a medium code model object can refer to data in a section that
// sets this flag besides being able to refer to data in a section that does
// not set it; likewise, a small code model object can refer only to code in a
// section that does not set this flag.
SHF_X86_64_LARGE = 0x10000000,
// All sections with the GPREL flag are grouped into a global data area
// for faster accesses
SHF_HEX_GPREL = 0x10000000,
// Section contains text/data which may be replicated in other sections.
// Linker must retain only one copy.
SHF_MIPS_NODUPES = 0x01000000,
// Linker must generate implicit hidden weak names.
SHF_MIPS_NAMES = 0x02000000,
// Section data local to process.
SHF_MIPS_LOCAL = 0x04000000,
// Do not strip this section.
SHF_MIPS_NOSTRIP = 0x08000000,
// Section must be part of global data area.
SHF_MIPS_GPREL = 0x10000000,
// This section should be merged.
SHF_MIPS_MERGE = 0x20000000,
// Address size to be inferred from section entry size.
SHF_MIPS_ADDR = 0x40000000,
// Section data is string data by default.
SHF_MIPS_STRING = 0x80000000,
// Make code section unreadable when in execute-only mode
SHF_ARM_PURECODE = 0x20000000
};
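Since sh_flags is a bit mask, individual attributes are tested with a bitwise AND. The sketch below renders the three most common flags as the letters W, A and X, similar to the flag column printed by readelf. Only the three generic flag values shown in the enum above are used; the helper name shf_to_string is hypothetical.

```c
#include <stdint.h>

/* The three generic flags used here, with the values listed above. */
enum {
    SHF_WRITE = 0x1,    /* writable during execution */
    SHF_ALLOC = 0x2,    /* occupies memory during execution */
    SHF_EXECINSTR = 0x4 /* contains executable instructions */
};

/* Hypothetical helper: writes a short flag string such as "WA" or "AX"
 * into out, which must have room for at least 4 bytes. */
static void shf_to_string(uint32_t flags, char out[4]) {
    int n = 0;
    if (flags & SHF_WRITE) out[n++] = 'W';
    if (flags & SHF_ALLOC) out[n++] = 'A';
    if (flags & SHF_EXECINSTR) out[n++] = 'X';
    out[n] = '\0';
}
```

For example, a typical .text section (SHF_ALLOC | SHF_EXECINSTR) would be rendered as "AX" and a typical .data section (SHF_WRITE | SHF_ALLOC) as "WA".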
To access the section headers:
e_shoff in the ELF header gives the file offset of the first section header (the NULL entry).
e_shnum in the ELF header gives the total number of section headers in the file.
Section headers are contiguous. Given a pointer to the first entry, subsequent entries can be accessed with simple pointer arithmetic or array operations.
// ?
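static inline Elf32_Shdr *elf_sheader(Elf32_Ehdr *hdr) {
    // Do the arithmetic on a byte pointer: casting a pointer to int
    // truncates it on 64-bit targets.
    return (Elf32_Shdr *)((uint8_t *)hdr + hdr->e_shoff);
}
static inline Elf32_Shdr *elf_section(Elf32_Ehdr *hdr, int idx) {
    return &elf_sheader(hdr)[idx];
}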
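Combining e_shoff, e_shnum, e_shstrndx and sh_name, a section's name can be resolved roughly as sketched below. This is a bounds-checked illustration rather than a reference implementation: the helper name elf32_section_name is hypothetical, and the code assumes the image is already in memory, that e_shentsize equals sizeof(Elf32_Shdr), and that the file's byte order matches the host's.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint16_t Elf32_Half;
typedef uint32_t Elf32_Word, Elf32_Addr, Elf32_Off;

typedef struct {
    uint8_t e_ident[16];
    Elf32_Half e_type, e_machine;
    Elf32_Word e_version;
    Elf32_Addr e_entry;
    Elf32_Off e_phoff, e_shoff;
    Elf32_Word e_flags;
    Elf32_Half e_ehsize, e_phentsize, e_phnum;
    Elf32_Half e_shentsize, e_shnum, e_shstrndx;
} Elf32_Ehdr;

typedef struct {
    Elf32_Word sh_name, sh_type, sh_flags;
    Elf32_Addr sh_addr;
    Elf32_Off sh_offset;
    Elf32_Word sh_size, sh_link, sh_info, sh_addralign, sh_entsize;
} Elf32_Shdr;

/* Hypothetical helper: returns the name of section idx, or NULL if any
 * offset or index falls outside the image. Headers are copied out with
 * memcpy so the image buffer does not need any particular alignment. */
static const char *elf32_section_name(const uint8_t *image, size_t size,
                                      unsigned idx) {
    Elf32_Ehdr eh;
    Elf32_Shdr sec, strtab;
    if (size < sizeof eh) return NULL;
    memcpy(&eh, image, sizeof eh);
    if (idx >= eh.e_shnum || eh.e_shstrndx >= eh.e_shnum) return NULL;
    /* The whole section header table must lie inside the image. */
    if (eh.e_shoff > size || (size - eh.e_shoff) / sizeof sec < eh.e_shnum)
        return NULL;
    memcpy(&sec, image + eh.e_shoff + (size_t)idx * sizeof sec, sizeof sec);
    memcpy(&strtab, image + eh.e_shoff + (size_t)eh.e_shstrndx * sizeof strtab,
           sizeof strtab);
    /* The string table must lie inside the image, and sh_name inside it. */
    if (strtab.sh_offset > size || strtab.sh_size > size - strtab.sh_offset)
        return NULL;
    if (sec.sh_name >= strtab.sh_size) return NULL;
    const char *name = (const char *)image + strtab.sh_offset + sec.sh_name;
    /* Require a terminating NUL before the end of the string table. */
    if (memchr(name, '\0', strtab.sh_size - sec.sh_name) == NULL) return NULL;
    return name;
}
```

The same offset-into-string-table pattern applies to symbol names, with the string table chosen through the symbol table's sh_link instead of e_shstrndx.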
A program header defines information about how the ELF program behaves once it’s been loaded, as well as runtime linking information.
Files used to build a process image (execute a program) must have a program header table; relocatable files do not need one.
ELF program headers (much like section headers) are all grouped together to make up the program header table.
// ?
typedef struct {
Elf32_Word p_type;
Elf32_Off p_offset;
Elf32_Addr p_vaddr;
Elf32_Addr p_paddr;
Elf32_Word p_filesz;
Elf32_Word p_memsz;
Elf32_Word p_flags;
Elf32_Word p_align;
} Elf32_Phdr;
Reference: Program Header (Linker and Libraries Guide)
“An executable or shared object file’s program header table is an array of structures, each describing a segment or other information that the system needs to prepare the program for execution. An object file segment contains one or more sections”.
Segment Contents: Text segments contain read-only instructions and data. Data segments contain writable data and instructions. See more about Segment Contents.
A PT_DYNAMIC program header element points at the .dynamic section. The .got and .plt sections also hold information related to position-independent code and dynamic linking.
The .plt can reside in a text or a data segment, depending on the processor. See processor specific GOT, and Processor specific PLT.
The .bss section has the type SHT_NOBITS. Normally, this uninitialized data resides at the end of the segment, thereby making p_memsz larger than p_filesz in the associated program header element.
p_type. The kind of segment this array element describes or how to interpret the array element’s information. Example types:
PT_NULL, 0.
PT_LOAD, 1. A loadable segment, described by p_filesz and p_memsz. The bytes from the file are mapped to the beginning of the memory segment. If the segment’s memory size (p_memsz) is larger than the file size (p_filesz), the extra bytes are defined to hold the value 0 and to follow the segment’s initialized area. The file size cannot be larger than the memory size. Loadable segment entries in the program header table appear in ascending order, sorted on the p_vaddr member.
PT_DYNAMIC, 2.
PT_INTERP, 3.
PT_NOTE, 4.
PT_SHLIB, 5.
PT_PHDR, 6.
PT_LOSUNW, 0x6ffffffa.
p_offset. The offset from the beginning of the file at which the first byte of the segment resides.
p_vaddr. The virtual address at which the first byte of the segment resides in memory.
p_paddr. The segment’s physical address for systems in which physical addressing is relevant. Because the system ignores physical addressing for application programs, this member has unspecified contents for executable files and shared objects.
p_filesz. The size of the segment in the file image, in bytes.
p_memsz. The size of the segment in memory, in bytes.
p_flags. Flags relevant to the segment. Examples:
PF_X, 0x1, Execute
PF_W, 0x2, Write
PF_R, 0x4, Read
PF_MASKPROC, 0xf0000000. Unspecified.
p_align. A positive, integral power of 2, such that p_vaddr % p_align = p_offset % p_align.
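The two numeric rules above (p_filesz must not exceed p_memsz, and p_vaddr and p_offset must be congruent modulo p_align) can be checked as in the following sketch. The helper names are hypothetical. Treating a p_align of 0 or 1 as "no alignment required" follows the ELF specification and is not stated in the text above.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t p_type;
    uint32_t p_offset;
    uint32_t p_vaddr;
    uint32_t p_paddr;
    uint32_t p_filesz;
    uint32_t p_memsz;
    uint32_t p_flags;
    uint32_t p_align;
} Elf32_Phdr;

enum { PT_LOAD = 1 };

/* Hypothetical helper: checks the alignment rule for one program header.
 * Per the ELF spec, p_align values 0 and 1 mean no alignment is required. */
static bool phdr_alignment_ok(const Elf32_Phdr *ph) {
    uint32_t a = ph->p_align;
    if (a <= 1) return true;
    if (a & (a - 1)) return false; /* not a power of two */
    return (ph->p_vaddr % a) == (ph->p_offset % a);
}

/* Hypothetical helper: for a PT_LOAD segment, stores in *out the number of
 * trailing bytes the loader must zero-fill (p_memsz - p_filesz). Returns
 * false for other segment types or if the file size exceeds the memory
 * size, which the format does not allow. */
static bool phdr_zero_fill(const Elf32_Phdr *ph, uint32_t *out) {
    if (ph->p_type != PT_LOAD || ph->p_filesz > ph->p_memsz) return false;
    *out = ph->p_memsz - ph->p_filesz;
    return true;
}
```

For a data segment whose tail is .bss, the zero-fill count is exactly the part of the segment that has no bytes in the file.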
Various sections in ELF are predefined and hold program and control information. These sections are used by the operating system and have different types and attributes for different operating systems.
Section names with a dot (.) prefix are reserved for the system. Applications may use names without the prefix to avoid conflicts with system sections.
An object file may have more than one section with the same name.
Executables are created from individual object files and libraries through the linking process. The linker’s tasks include:
The linking and loading processes require information defined in the object files and store this information in specific sections such as .dynamic.
There are also sections for program control, including .bss, .data, .data1, .rodata, and .rodata1, and sections for debugging, such as .debug, .line, etc.
A list of special sections for the ELF specification:
Symbol table is a section (or a number of sections) that defines the location, type, visibility and other traits of various symbols declared in the original source, created during compilation or linking, or otherwise present in the file.
More info Symbol Table
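For orientation, the 32-bit symbol table entry and the macros that unpack its st_info byte are sketched below. The layout, macro definitions and constant values are taken from the ELF specification, not from the text above; the helper sym_is_global_func is a hypothetical example of how the fields might be queried.

```c
#include <stdint.h>

/* 32-bit symbol table entry (layout per the ELF specification). */
typedef struct {
    uint32_t st_name;  /* offset of the name in the associated string table */
    uint32_t st_value; /* address or offset, depending on context */
    uint32_t st_size;  /* size of the object, 0 if unknown */
    uint8_t st_info;   /* binding (high 4 bits) and type (low 4 bits) */
    uint8_t st_other;  /* visibility */
    uint16_t st_shndx; /* index of the section the symbol is defined in */
} Elf32_Sym;

#define ELF32_ST_BIND(i) ((i) >> 4)
#define ELF32_ST_TYPE(i) ((i) & 0xf)

enum { STB_LOCAL = 0, STB_GLOBAL = 1, STB_WEAK = 2 };
enum { STT_NOTYPE = 0, STT_OBJECT = 1, STT_FUNC = 2, STT_SECTION = 3, STT_FILE = 4 };

/* Hypothetical helper: true for a globally bound function symbol. */
static int sym_is_global_func(const Elf32_Sym *s) {
    return ELF32_ST_BIND(s->st_info) == STB_GLOBAL &&
           ELF32_ST_TYPE(s->st_info) == STT_FUNC;
}
```

The location, type and visibility traits mentioned above map onto st_value/st_shndx, st_info and st_other respectively.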
A string table is a number of consecutive zero-terminated strings.
The object file uses these strings to represent symbol and section names.
.strtab, the default string table.
.shstrtab, the section name string table.
.dynstr, the string table for dynamic linking.
Any time the loading process needs access to a string, it uses an offset into one of the string tables.
sh_size in the corresponding section header entry specifies the size of the string table.
The simplest program loader may copy all string tables into memory, but a more complete solution would omit any that are not necessary at runtime, notably those not flagged with SHF_ALLOC in their respective section header (such as .shstrtab, since section names aren’t used at runtime).
.bss: a block of memory that is zero-filled. It holds global variables that are uninitialized or initialized to 0 or null.
Its type (sh_type) is SHT_NOBITS, which means it occupies no space in the object file but must be allocated at runtime.
BSS should be allocated before performing any operation that relies on relative addressing (such as relocation), as failing to do so can cause code to reference garbage memory or fault.
Any section that is of type SHT_NOBITS and has the attribute SHF_ALLOC should be allocated early on during program loading.
// ?
#include <stdint.h> // uintptr_t
#include <stdlib.h> // calloc
static int elf_load_stage1(Elf32_Ehdr *hdr) {
    Elf32_Shdr *shdr = elf_sheader(hdr);
    unsigned int i;
    // Iterate over section headers
    for (i = 0; i < hdr->e_shnum; i++) {
        Elf32_Shdr *section = &shdr[i];
        // Only sections that are not present in the file are handled here
        if (section->sh_type != SHT_NOBITS) continue;
        // Skip the section if it is empty
        if (!section->sh_size) continue;
        // If the section should appear in memory
        if (section->sh_flags & SHF_ALLOC) {
            // Allocate zeroed memory (calloc zero-fills the block)
            void *mem = calloc(1, section->sh_size);
            if (!mem) return -1; // Out of memory
            // Record the memory location as an offset from the image base.
            // sh_offset is 32 bits wide, so this assumes the difference
            // fits, e.g. on a 32-bit target.
            section->sh_offset = (Elf32_Off)((uintptr_t)mem - (uintptr_t)hdr);
            DEBUG("Allocated memory for a section (%u).\n", (unsigned)section->sh_size);
        }
    }
    return 0;
}
Position independent code.
A relocation section is a table of relocation entries.
There are two types of relocation section entry:
SHT_RELA, relocation with explicit addend;
SHT_REL, relocation without explicit addend.
A given relocation section holds only one type of entry.
// ?
typedef struct {
Elf32_Addr r_offset;
Elf32_Word r_info;
} Elf32_Rel;
typedef struct {
Elf32_Addr r_offset;
Elf32_Word r_info;
Elf32_Sword r_addend;
} Elf32_Rela;
r_info: the upper 24 bits hold the symbol table index of the symbol to which the relocation applies; the lowest byte stores the type of relocation.
sh_link in the relocation section header stores the index of the symbol table section header.
Number of entries = section size (sh_size) / entry size (sh_entsize).
Each relocation table is specific to a single section, whose section header index is stored in sh_info.
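The r_info split and the entry count can be written down as the short sketch below. The ELF32_R_SYM and ELF32_R_TYPE macros are the ones defined by the ELF specification for 32-bit objects; the helper rel_entry_count is hypothetical and simply guards the division against a zero sh_entsize.

```c
#include <stdint.h>

typedef struct {
    uint32_t r_offset; /* where to apply the relocation */
    uint32_t r_info;   /* symbol index (upper 24 bits) and type (low 8 bits) */
} Elf32_Rel;

/* Unpack r_info for 32-bit objects (definitions per the ELF spec). */
#define ELF32_R_SYM(i) ((i) >> 8)
#define ELF32_R_TYPE(i) ((uint8_t)(i))

/* Hypothetical helper: number of entries in a relocation section, computed
 * from its header fields. Returns 0 if sh_entsize is 0. */
static uint32_t rel_entry_count(uint32_t sh_size, uint32_t sh_entsize) {
    return sh_entsize ? sh_size / sh_entsize : 0;
}
```

With r_info equal to 0x00000501, for instance, the symbol index is 5 and the relocation type is 1.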
The .ctors/.dtors sections store the addresses of global constructors and destructors.
Global constructors are supposed to have run before the main function.
Each section is a table of pointers, and each pointer refers to a function that must be executed as a global constructor or destructor.
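Walking such a table of function pointers can be sketched as follows. This is only an illustration of the idea: the names run_ctors, demo_ctor_a, demo_ctor_b and demo_log are hypothetical, the table is passed in explicitly instead of being located through linker-provided symbols, and the entries are called in table order, whereas real runtimes define their own ordering and sentinel conventions.

```c
#include <stddef.h>

typedef void (*ctor_fn)(void);

/* Hypothetical helper: calls every non-NULL entry in [begin, end). */
static void run_ctors(const ctor_fn *begin, const ctor_fn *end) {
    for (const ctor_fn *fn = begin; fn != end; ++fn) {
        if (*fn != NULL) {
            (*fn)();
        }
    }
}

/* Demo constructors (hypothetical): each appends a digit to demo_log so the
 * call order can be observed. */
static int demo_log = 0;
static void demo_ctor_a(void) { demo_log = demo_log * 10 + 1; }
static void demo_ctor_b(void) { demo_log = demo_log * 10 + 2; }
```

A loader or startup routine would call something like run_ctors on the constructor table before transferring control to main, and the matching destructor table at exit.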
More at CTOR/DTOR
Reference 1 Calling Global Constructors – CTOR/DTOR ↩