Linkers and Loaders

Linker/Loader: binds more abstract names to more concrete names.

Example:

getline –> “the location 612 bytes from the beginning of the executable code in module iosys”.

“the location 450 bytes beyond the beginning of the static data from this module” –> numeric address.

History of address binding

Overlays by linker: different parts of a program share the same memory, with each overlay loaded on demand when another part of the program calls into it.

Overlays faded as virtual memory spread.

Assembler

Object file

Section | Description
Header | size and positions
Text Segment | instructions
Data Segment | static data: locals/globals, strings, constants
Debugging Information | mapping from source lines to code
Symbol Table | external (exported) and unresolved (imported) references

Inspect an object file with objdump: --disassemble (-d) shows the code, --syms (-t) shows the symbol table.

Handling forward references

  • Can be done in two passes:

    • pass 1: scan the whole program, allocate instructions and lay out data, and determine addresses;
    • pass 2: emit instructions and data, using the label offsets determined in pass 1.
  • Can also be done in one pass:

    • emit instructions as they are read. Emit a 0 for jumps to labels that are not yet defined, and keep track of where these instructions are;
    • backpatch: fill in the 0 offsets as the labels are defined.
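The two strategies above can be compared on a toy instruction set. The sketch below is illustrative only: the `jmp`/`nop` mnemonics, the one-slot-per-instruction layout, and the function names are assumptions made for the example, and label values are absolute slot indices rather than encoded offsets.

```python
def assemble_two_pass(lines):
    # Pass 1: walk the program and assign an address to every label.
    labels, addr = {}, 0
    for line in lines:
        if line.endswith(":"):
            labels[line[:-1]] = addr
        else:
            addr += 1
    # Pass 2: emit instructions; every label is already known.
    out = []
    for line in lines:
        if line.endswith(":"):
            continue
        op, *args = line.split()
        out.append((op, labels[args[0]]) if op == "jmp" else (op,))
    return out


def assemble_one_pass(lines):
    labels, out = {}, []
    pending = {}  # label -> indices of emitted jumps that still hold a 0
    for line in lines:
        if line.endswith(":"):
            name = line[:-1]
            labels[name] = len(out)
            # Backpatch every jump that was waiting for this label.
            for i in pending.pop(name, []):
                out[i] = ("jmp", labels[name])
            continue
        op, *args = line.split()
        if op != "jmp":
            out.append((op,))
        elif args[0] in labels:
            out.append(("jmp", labels[args[0]]))  # backward reference
        else:
            pending.setdefault(args[0], []).append(len(out))
            out.append(("jmp", 0))  # placeholder, patched later
    if pending:
        raise ValueError(f"undefined labels: {sorted(pending)}")
    return out


program = ["start:", "jmp end", "nop", "jmp start", "end:", "nop"]
print(assemble_one_pass(program) == assemble_two_pass(program))
```

Both functions produce the same output for this program; the one-pass version trades the second scan for the bookkeeping in `pending`.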

Handling external references

Output object files:

  • binary machine code, but not executable
  • may refer to external symbols
  • each object has its own address space

Linkers: bind symbols to relative addresses inside a program.

Loaders: bind symbols to actual (non-relative) addresses.

Static Linker

  • Assign addresses to everything: determine the starting address of a subroutine and update all relative addresses.
  • A linked executable contains code to initialize memory prior to running a program: the subroutines all have addresses, the data all has addresses, the subroutines know about each other and the data they use, and there are instructions for the loader.
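A minimal sketch of what "assign addresses to everything" can look like, under heavy simplification. The object format here (a dict with `text`, `symbols`, and `relocs`) and the `link` function are hypothetical and invented for illustration; real object formats carry far more detail.

```python
def link(objects, base=0):
    # Step 1: give each object's text a starting address in the output image.
    starts, addr = [], base
    for obj in objects:
        starts.append(addr)
        addr += len(obj["text"])
    # Step 2: build a global symbol table holding final addresses.
    table = {}
    for obj, start in zip(objects, starts):
        for name, offset in obj["symbols"].items():
            if name in table:
                raise ValueError(f"duplicate symbol: {name}")
            table[name] = start + offset
    # Step 3: copy the code and patch every recorded reference.
    image = []
    for obj in objects:
        text = list(obj["text"])
        for offset, name in obj["relocs"]:
            if name not in table:
                raise ValueError(f"unresolved symbol: {name}")
            text[offset] = table[name]
        image.extend(text)
    return image, table


# Two toy objects: main refers to getline, which is defined in iosys.
main_o = {"text": [10, 0, 11], "symbols": {"main": 0}, "relocs": [(1, "getline")]}
iosys_o = {"text": [20, 21], "symbols": {"getline": 1}, "relocs": []}

image, table = link([main_o, iosys_o], base=100)
print(table)
print(image)
```

Each object starts with its own address space (offsets from 0); after linking, the placeholder 0 in `main_o` has been replaced by the address assigned to `getline`.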

Dynamic linking

Dynamic linking, that is, resolving the address for a procedure call, can happen:

  • at program start.
  • when the procedure is called for the first time.
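The difference between the two moments can be sketched with a toy table of procedure addresses. This is an analogy, not how any real dynamic linker is implemented: the `LazyTable` class, the dictionary standing in for a shared library, and the `lookups` counter are all assumptions made for the example.

```python
class LazyTable:
    """Toy table of procedures whose addresses are bound on demand."""

    def __init__(self, library):
        self.library = library  # name -> callable, stands in for a shared library
        self.bound = {}         # names that have already been resolved
        self.lookups = 0        # how many times the slow resolver ran

    def _resolve(self, name):
        self.lookups += 1
        return self.library[name]

    def call(self, name, *args):
        # Binding on first call: resolve once, then reuse the result.
        if name not in self.bound:
            self.bound[name] = self._resolve(name)
        return self.bound[name](*args)

    def bind_all(self):
        # Binding at program start: resolve everything up front.
        for name in self.library:
            if name not in self.bound:
                self.bound[name] = self._resolve(name)


fake_lib = {"strlen": len, "toupper": str.upper}

lazy = LazyTable(fake_lib)
lazy.call("strlen", "abc")
lazy.call("strlen", "hello")
print(lazy.lookups)   # only the procedure that was called got resolved

eager = LazyTable(fake_lib)
eager.bind_all()
print(eager.lookups)  # every procedure was resolved before any call
```

Binding on first call avoids paying for procedures that are never used; binding at start makes the cost predictable and surfaces missing symbols early.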

For shared libraries:

  • when linking, the linker does not copy anything from the library into the program. Instead, it notes in the output file the names of the libraries in which the symbols were found, so that the shared library can be bound in when the program is loaded.

Loader

Reads an executable and runs the program: setting up memory, as well as re-doing the linker’s job for some dynamic libraries.

Dynamic libraries are linked when you run the program instead of when you compile it.

Loaders can map a shared library to the same physical addresses but at different virtual addresses for each application that uses it, which saves physical memory.

Run-time loading: link the program against the loader itself and, as the program runs, invoke the loader’s “load this subroutine from this dynamic library” subroutine. The mechanics of such run-time loading are the same as those of execution-time loading.

  • Benefit: the program can react to missing libraries.
  • Penalty: the library is not listed in the binary, so it is hard to tell in advance which libraries are needed.

Common in Windows, almost universal in OS X, and unusual in Linux. Why?
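As a rough illustration of the "react to missing libraries" benefit, Python's standard `ctypes` module can ask the platform loader for a library while the program is running. The helper names `load_optional` and `cos_via_libm` are made up for this sketch, and whether the C math library can be located depends on the platform, so the second helper may simply return `None`.

```python
import ctypes
import ctypes.util


def load_optional(name):
    """Try to load a shared library at run time; return None if it is missing."""
    try:
        return ctypes.CDLL(name)
    except OSError:
        return None


def cos_via_libm(x):
    """Call cos() from the C math library if it can be found, else return None."""
    path = ctypes.util.find_library("m")  # may be None on some platforms
    lib = load_optional(path) if path else None
    fn = getattr(lib, "cos", None) if lib is not None else None
    if fn is None:
        return None
    # Declare the C signature: double cos(double).
    fn.restype = ctypes.c_double
    fn.argtypes = [ctypes.c_double]
    return fn(x)


# A library that does not exist: the program keeps running and can fall back.
missing = load_optional("libhypothetical_missing_12345.so")
print("fallback needed" if missing is None else "loaded")
print(cos_via_libm(0.0))
```

Nothing in the program file itself records that the math library is wanted, which is the penalty noted above: the dependency only becomes visible when the call is made.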

  • Linker Scripts
  • Reference

SECTIONS Command

Output section description:

    section [address] [(type)] :
      [AT(lma)]
      [ALIGN(section_align) | ALIGN_WITH_INPUT]
      [SUBALIGN(subsection_align)]
      [constraint]
      {
        output-section-command
        output-section-command
        ...
      } [>region] [AT>lma_region] [:phdr :phdr ...] [=fillexp] [,]

VMA and LMA

Every section has a virtual memory address (VMA) and a load memory address (LMA); see basic script concepts. The address in a linker script is the virtual address (VMA). This address is optional, but if it is provided then the output address will be set exactly as specified.

Created Jan 28, 2020 // Last Updated May 18, 2021
