Reference
lld xxx.o -o xxx.exe
lld --verbose
LinkerDriver::link<ELFT>
is the main entry point for the link; it prepares the sections needed for the final executable.
inputSections
holds the list of all sections from all object files.
outputSections
// lld/ELF/Driver.cpp
// Do actual linking. Note that when this function is called,
// all linker scripts have already been parsed.
template <class ELFT> void LinkerDriver::link(opt::InputArgList &args) {
  ...
  // Now that we have a complete list of input files.
  // Beyond this point, no new files are added.
  // Aggregate all input sections into one place.
  for (InputFile *f : objectFiles)
    for (InputSectionBase *s : f->getSections()) {
      if (s && s != &InputSection::discarded) {
        inputSections.push_back(s);
        MSGLL("input section added: '" + s->name +
              "', size: " + std::to_string(s->getSize()) + "\n");
      } else {
        if (s)
          MSGLL("empty section discarded: '" + s->name + "'\n");
      }
    }
  for (BinaryFile *f : binaryFiles)
    for (InputSectionBase *s : f->getSections())
      inputSections.push_back(cast<InputSection>(s));
  llvm::erase_if(inputSections, [](InputSectionBase *s) {
    if (s->type == SHT_LLVM_SYMPART) {
      readSymbolPartitionSection<ELFT>(s);
      MSGLL("input section '" + s->name + "' removed\n");
      return true;
    }
    // We do not want to emit debug sections if --strip-all
    // or --strip-debug are given.
    if (config->strip != StripPolicy::None &&
        (s->name.startswith(".debug") || s->name.startswith(".zdebug"))) {
      MSGLL("input section (debug) '" + s->name + "' removed\n");
      return true;
    } else {
      return false;
    }
  });
  // Linker scripts control how input sections are assigned to output sections.
  // Input sections that were not handled by scripts are called "orphans", and
  // they are assigned to output sections by the default rule. Process that.
  script->addOrphanSections();
  ...
}
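The aggregate-then-filter shape of link() can be sketched outside of lld. The following is a minimal standalone model, not lld code: `Section`, `startsWith` and `collectInputSections` are hypothetical names, and the strip policy is reduced to a single boolean.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for lld's InputSectionBase; only the fields
// this sketch needs.
struct Section {
  std::string name;
  bool discarded;
};

static bool startsWith(const std::string &s, const std::string &prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

// Step 1: gather every non-discarded section from every file into one list.
// Step 2: erase debug sections from that list when stripping is requested.
std::vector<const Section *>
collectInputSections(const std::vector<std::vector<Section>> &files,
                     bool stripDebug) {
  std::vector<const Section *> out;
  for (const auto &file : files)
    for (const Section &s : file)
      if (!s.discarded)
        out.push_back(&s);

  out.erase(std::remove_if(out.begin(), out.end(),
                           [&](const Section *s) {
                             return stripDebug &&
                                    (startsWith(s->name, ".debug") ||
                                     startsWith(s->name, ".zdebug"));
                           }),
            out.end());
  return out;
}
```

With two files holding `.text`, `.debug_info`, `.data` and one discarded section, the call keeps three sections without stripping and two with it.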
LinkerScript::addOrphanSections()
handles sections that are not assigned by the linker script, placing them according to default rules.
// Add sections that didn't match any sections command.
void LinkerScript::addOrphanSections() {
  StringMap<TinyPtrVector<OutputSection *>> map;
  std::vector<OutputSection *> v;
  auto add = [&](InputSectionBase *s) {
    MSGLL("Handling Orphan Section: " + s->name + "\n");
    if (!s->isLive() || s->parent)
      return;
    StringRef name = getOutputSectionName(s);
    MSGLL("Orphan section is placed in '" + name + "'\n");
    if (config->orphanHandling == OrphanHandlingPolicy::Error)
      error(toString(s) + " is being placed in '" + name + "'");
    else if (config->orphanHandling == OrphanHandlingPolicy::Warn)
      warn(toString(s) + " is being placed in '" + name + "'");
    if (OutputSection *sec = findByName(sectionCommands, name)) {
      sec->addSection(cast<InputSection>(s));
      return;
    }
    if (OutputSection *os = addInputSec(map, s, name))
      v.push_back(os);
    assert(s->getOutputSection()->sectionIndex == UINT32_MAX);
  };
  // For further --emit-reloc handling code we need target output section
  // to be created before we create relocation output section, so we want
  // to create target sections first. We do not want priority handling
  // for synthetic sections because they are special.
  for (InputSectionBase *isec : inputSections) {
    if (auto *sec = dyn_cast<InputSection>(isec))
      if (InputSectionBase *rel = sec->getRelocatedSection())
        if (auto *relIS = dyn_cast_or_null<InputSectionBase>(rel->parent))
          add(relIS);
    add(isec);
  }
  // If no SECTIONS command was given, we should insert sections commands
  // before others, so that we can handle scripts which refers them,
  // for example: "foo = ABSOLUTE(ADDR(.text)));".
  // When SECTIONS command is present we just add all orphans to the end.
  if (hasSectionsCommand)
    sectionCommands.insert(sectionCommands.end(), v.begin(), v.end());
  else
    sectionCommands.insert(sectionCommands.begin(), v.begin(), v.end());
}
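The final placement rule (orphans appended when a SECTIONS command exists, prepended otherwise) can be illustrated in isolation. This is only a sketch: `placeOrphans` is a hypothetical helper, and output sections are modeled as plain names rather than lld's `OutputSection` objects.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Insert orphan output sections relative to the script-defined commands.
// Takes sectionCommands by value and returns the updated list.
std::vector<std::string>
placeOrphans(std::vector<std::string> sectionCommands,
             const std::vector<std::string> &orphans,
             bool hasSectionsCommand) {
  // With a SECTIONS command, orphans go to the end; without one, they go
  // to the front so that later script commands can refer to them.
  auto pos = hasSectionsCommand ? sectionCommands.end()
                                : sectionCommands.begin();
  sectionCommands.insert(pos, orphans.begin(), orphans.end());
  return sectionCommands;
}
```

Given commands `.text, .data` and orphans `.foo, .bar`, the result is `.text, .data, .foo, .bar` with a SECTIONS command and `.foo, .bar, .text, .data` without one.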
Reference
A thunk is a code sequence inserted by the linker between a caller and the callee. A relocation to the callee is redirected to the thunk.
// lld/ELF/Thunks.h
// Class to describe an instance of a Thunk.
// A Thunk is a code-sequence inserted by the linker in between a caller and
// the callee. The relocation to the callee is redirected to the Thunk, which
// after executing transfers control to the callee.
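One common reason for a linker to insert a thunk is that the callee is out of reach of the branch instruction. The sketch below models only that decision; it is not lld's Thunk class. The names `needsThunk` and `branchTarget` are hypothetical, and the reach of +/-128 MiB is an assumption matching an AArch64 B/BL instruction.

```cpp
#include <cassert>
#include <cstdint>

// Assumed branch reach: +/-128 MiB, as for an AArch64 B/BL instruction.
constexpr int64_t kBranchRange = 128 * 1024 * 1024;

// True when a direct branch located at `src` cannot reach `dst`.
bool needsThunk(uint64_t src, uint64_t dst) {
  int64_t delta = static_cast<int64_t>(dst - src);
  return delta >= kBranchRange || delta < -kBranchRange;
}

// Address the branch relocation should resolve to: the callee when it is
// in range, otherwise the thunk (which then transfers control to the callee).
uint64_t branchTarget(uint64_t src, uint64_t callee, uint64_t thunk) {
  return needsThunk(src, callee) ? thunk : callee;
}
```

A real linker must also place the thunk itself within range of the caller and re-run address assignment after inserting it, which this sketch leaves out.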
Reference
Relocation after Address Assignment
Each section has a relocate method with which it relocates itself. This happens after dot (location counter) address resolution and phdrs creation.
Relocation per Section
Relocation for each relocatable section:
// lld/ELF/InputSection.h
// class InputSectionBase
// Each section knows how to relocate itself. These functions apply
// relocations, assuming that Buf points to this section's copy in
// the mmap'ed output buffer.
template <class ELFT> void relocate(uint8_t *buf, uint8_t *bufEnd);
void relocateAlloc(uint8_t *buf, uint8_t *bufEnd);
static uint64_t getRelocTargetVA(const InputFile *File, RelType Type,
                                 int64_t A, uint64_t P, const Symbol &Sym,
                                 RelExpr Expr, InputSectionBase *isec,
                                 uint64_t offset);
[Figure: call path for relocate, relocateAlloc and getRelocTargetVA]
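As a rough illustration of what "a section relocates itself in its copy of the output buffer" means, here is a standalone sketch. It does not use lld's types: `RelKind`, `Reloc` and `relocateSection` are hypothetical, and the two relocation kinds are modeled on the x86-64 formulas S + A (R_X86_64_64) and S + A - P (R_X86_64_PC32). Values are written in host byte order, which is assumed to be little-endian.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Modeled on R_X86_64_64 (S + A) and R_X86_64_PC32 (S + A - P).
enum class RelKind { Abs64, PC32 };

struct Reloc {
  RelKind kind;
  uint64_t offset; // offset of the patched field inside the section
  uint64_t symVA;  // S: final virtual address of the target symbol
  int64_t addend;  // A
};

// Patch this section's bytes in the output buffer. `buf` points at the
// section's copy in the output; `sectionVA` is its assigned address.
void relocateSection(uint8_t *buf, uint64_t sectionVA,
                     const std::vector<Reloc> &relocs) {
  for (const Reloc &r : relocs) {
    uint64_t p = sectionVA + r.offset; // P: address of the patched field
    if (r.kind == RelKind::Abs64) {
      uint64_t v = r.symVA + r.addend;
      std::memcpy(buf + r.offset, &v, sizeof v);
    } else {
      uint32_t v = static_cast<uint32_t>(r.symVA + r.addend - p);
      std::memcpy(buf + r.offset, &v, sizeof v);
    }
  }
}
```

The sketch makes the ordering constraint visible: both S and P are final virtual addresses, so relocations can only be applied once address assignment has finished.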
Q&A
How to merge the .text sections from two object (relocatable) files into one executable binary?
How/when is the virtual address of each .text section determined?
The linker scans and parses each output section command and updates dot (the location counter) as it processes each section. See LinkerScript::assignAddresses(), which is reached via:
Writer<ELFT>::finalizeSections() => Writer<ELFT>::finalizeAddressDependentContent() => LinkerScript::assignAddresses();
How/when are the other sections that relate to the relocated .text sections updated?
References: lld/ELF/Driver.
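The address-assignment answer above can be pictured with a small model of the location counter. This is a sketch under simplifying assumptions, not lld's LinkerScript::assignAddresses(): `InSec`, `alignTo` and `assignAddresses` here are hypothetical, and only the size and alignment of each input .text section are considered.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical input section: size and alignment come from the object
// file; addr is filled in by the layout pass.
struct InSec {
  uint64_t size;
  uint64_t align;
  uint64_t addr;
};

// Round v up to the next multiple of a (a must be non-zero).
uint64_t alignTo(uint64_t v, uint64_t a) { return (v + a - 1) / a * a; }

// Lay out the input sections merged into one output section, starting at
// `dot`. Returns the location counter after the last section.
uint64_t assignAddresses(std::vector<InSec> &secs, uint64_t dot) {
  for (InSec &s : secs) {
    dot = alignTo(dot, s.align); // respect the section's alignment
    s.addr = dot;                // the section starts at the current dot
    dot += s.size;               // advance dot past the section
  }
  return dot;
}
```

For two .text sections of sizes 0x25 and 0x10 with 16-byte alignment and a start of 0x401000, the second section lands at 0x401030 and dot ends at 0x401040.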