SonicBOOM

Instruction Fetch

References:

The BOOM Front-end

The Front-end fetches instructions and makes predictions throughout the Fetch stage, redirecting the instruction stream across multiple fetch cycles (F0, F1, …).

Misprediction:

  • Detected in BOOM’s Back-end (the execution pipeline);
  • A request is sent to the Front-end;

ICache:

  • Virtually indexed, physically tagged (VIPT) set-associative cache (see the sketch after this list);
  • To save power, the i-cache is only fired up again once the fetch buffer has been exhausted (or a branch prediction directs the PC elsewhere);
  • Does not (currently) support fetching across cache-lines, nor does it support fetching unaligned relative to the superscalar fetch address;
  • Does not (currently) support hit-under-miss;
    • If an i-cache miss occurs, the i-cache will not accept any further requests until the miss has been handled.
    • This is less than ideal for scenarios in which the pipeline discovers a branch mispredict and would like to redirect the i-cache to start fetching along the correct path;
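
A rough software sketch of what “virtually indexed, physically tagged” means (the sizes below are assumptions for illustration, not BOOM’s actual configuration): with 4 KiB pages, 64-byte lines, and 64 sets, the set index fits entirely inside the page offset, so the set can be selected from the virtual address in parallel with the TLB lookup, and only the tag comparison needs the translated physical address.

```scala
// Behavioral sketch of a VIPT lookup; parameters are assumptions, not BOOM's real config.
// 4 KiB pages, 64-byte lines, 64 sets  ->  index + offset = 12 bits = page offset,
// so the set index can be read from the virtual address before translation finishes.
object ViptSketch {
  val offsetBits = 6   // log2(64-byte line)
  val indexBits  = 6   // log2(64 sets)

  def setIndex(vaddr: Long): Long =                  // taken from the untranslated bits
    (vaddr >> offsetBits) & ((1L << indexBits) - 1)

  def physTag(paddr: Long): Long =                   // compared after TLB translation
    paddr >>> (offsetBits + indexBits)

  def main(args: Array[String]): Unit = {
    val vaddr = 0x12345L << 6                        // hypothetical virtual address
    val paddr = 0xabcd5L << 6                        // hypothetical physical translation
    println(f"set index = ${setIndex(vaddr)}, tag = 0x${physTag(paddr)}%x")
  }
}
```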

Fetch Buffer:

  • Holds Fetch Packets coming from the i-cache;
  • Parameterizable: the number of entries, and whether it is a ‘flow-through’ queue or not (see the sketch below);
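
A minimal software sketch of those two parameters (class and method names are mine, not BOOM’s Chisel code): a bounded queue of Fetch Packets where a flow-through configuration lets a packet written this cycle be read the same cycle, while a non-flow-through queue only exposes it on the next cycle.

```scala
import scala.collection.mutable

// Hypothetical sketch of a parameterizable fetch buffer; not BOOM's actual implementation.
//   numEntries  : queue depth
//   flowThrough : if true, a packet enqueued this "cycle" is visible to deq() immediately;
//                 if false, it only becomes visible after tick() (i.e. the next cycle).
class FetchBufferSketch[T](numEntries: Int, flowThrough: Boolean) {
  private val q       = mutable.Queue.empty[T]
  private var pending = Option.empty[T]

  def enq(packet: T): Boolean = {
    if (q.size + pending.size >= numEntries) return false   // full: the front-end must stall
    if (flowThrough) q.enqueue(packet) else pending = Some(packet)
    true
  }

  def deq(): Option[T] = if (q.nonEmpty) Some(q.dequeue()) else None

  def tick(): Unit = { pending.foreach(p => q.enqueue(p)); pending = None }  // advance one cycle
}
```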

Fetch Target Queue:

  • Holds the PCs received from the i-cache and the branch prediction info associated with each address;
  • Referenced during the execution of the pipeline’s Micro-Ops (UOPs);
  • Dequeued by the ROB once an instruction is committed, and updated during pipeline redirection/misspeculation (see the sketch below);
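
A hedged sketch of what an FTQ entry conceptually carries and how it is used (field and method names are illustrative, not the actual Chisel bundles): each entry records the Fetch PC plus the prediction state needed to train the predictors at commit, or to repair them on a misspeculation.

```scala
import scala.collection.mutable.ArrayBuffer

// Illustrative FTQ model; names and fields are assumptions, not BOOM's actual code.
case class FtqEntrySketch(
  fetchPC: Long,        // PC of the head of the Fetch Packet
  cfiTaken: Boolean,    // was a control-flow instruction in the packet predicted taken?
  globalHistory: Long   // snapshot of the branch history used for this prediction
)

class FtqSketch(numEntries: Int) {
  private val entries = ArrayBuffer.empty[FtqEntrySketch]

  // Enqueued when the i-cache responds with a Fetch Packet.
  def enqueue(e: FtqEntrySketch): Boolean =
    if (entries.size < numEntries) { entries += e; true } else false   // full: stall fetch

  // Dequeued by the ROB as instructions commit; the entry is used to train the predictors.
  def commitHead(): Option[FtqEntrySketch] =
    if (entries.nonEmpty) Some(entries.remove(0)) else None

  // On a redirect/misspeculation, entries younger than the redirect point are discarded
  // and prediction state (e.g. the history snapshot) is restored from the surviving entry.
  def flushAfter(keep: Int): Unit =
    while (entries.size > keep) entries.remove(entries.size - 1)
}
```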

Branch Prediction

Two levels of prediction:

  • A fast Next-Line Predictor (NLP)

    • a Branch Target Buffer provides single-cycle predictions using combinational logic;
    • it plays the role of Rocket’s “BTB”, but is more complex;
    • an amalgamation of:
      • a fully associative Branch Target Buffer (BTB);
      • a Bi-Modal Table (BIM);
        • determines whether a branch is predicted taken or not taken;
      • a Return Address Stack (RAS);
  • A slower but more complex Backing Predictor (BPD);

    • similar to a GShare predictor;
    • its prediction takes priority over the NLP’s prediction (see the sketch below);
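
A tiny sketch of that priority rule (my own framing, not BOOM’s actual redirect logic): the fast NLP guess steers fetch immediately, and if the slower BPD later produces a different answer, the BPD’s verdict wins and fetch is redirected.

```scala
// Hypothetical arbitration between the two predictor levels; illustrative only.
object PredictorArbitrationSketch {
  sealed trait Prediction
  case class Taken(target: Long) extends Prediction
  case object NotTaken           extends Prediction

  // The NLP's single-cycle guess is used first; if the slower BPD disagrees a cycle or
  // two later, its prediction takes priority and the front-end is redirected.
  def resolve(nlp: Prediction, bpd: Option[Prediction]): Prediction =
    bpd.getOrElse(nlp)
}
```
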
NLP Predictions

  • Match the PC tag against the BTB entries;

    • The PC is the “Fetch PC”, not the PC of the branch itself
    • “Fetch PC” is the PC corresponding to the head of a Fetch Packet instruction group;
    • Each BTB entry corresponds to a single Fetch PC, but it helps predict across the entire Fetch Packet.
      • other designs instead choose to provide a whole bank of BTBs for each possible instruction in the Fetch Packet.
  • On a hit (see the sketch at the end of this section):

    • check the BIM to determine whether the branch is predicted taken or not taken;
    • if the instruction is a return, use the RAS to predict the return PC;
    • fetch from the predicted target PC on the next cycle;
  • for area-efficiency, the high-order bits of the PC tags and PC targets are stored in a compressed file.

  • RAS update:

    • if the taken instruction is a call, the return address is pushed onto the RAS;
    • if the taken instruction is a return, then the RAS is popped;
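
Putting the NLP pieces together, here is a hedged behavioral sketch of a lookup plus the RAS maintenance described above (structures and names are illustrative; the real NLP is a Chisel design with compressed tag/target storage): the Fetch PC is matched against BTB tags, the BIM’s 2-bit counter decides taken/not-taken for conditional branches, the RAS supplies the target for returns, and taken calls/returns push/pop the RAS.

```scala
import scala.collection.mutable

// Illustrative behavioral model of a next-line predictor; not BOOM's actual Chisel code.
object NlpSketch {
  sealed trait CfiType
  case object Branch extends CfiType
  case object Call   extends CfiType
  case object Return extends CfiType

  // One entry per Fetch PC; bimCounter is a 2-bit saturating counter (0..3).
  case class BtbEntry(target: Long, cfiType: CfiType, bimCounter: Int)

  val btb = mutable.Map.empty[Long, BtbEntry]   // fully associative, keyed by Fetch-PC tag
  val ras = mutable.Stack.empty[Long]           // return address stack

  // Returns Some(nextPC) if the NLP predicts a taken control-flow change, else None.
  def predict(fetchPC: Long): Option[Long] =
    btb.get(fetchPC).flatMap { e =>
      e.cfiType match {
        case Return                     => ras.headOption    // return PC comes from the RAS
        case Branch if e.bimCounter < 2 => None               // BIM: predicted not taken
        case _                          => Some(e.target)     // taken: fetch target next cycle
      }
    }

  // RAS maintenance for a taken control-flow instruction.
  def updateRas(cfiType: CfiType, returnAddress: Long): Unit = cfiType match {
    case Call   => ras.push(returnAddress)        // push the address after the call
    case Return => if (ras.nonEmpty) ras.pop()    // pop on a return
    case Branch => ()
  }
}
```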

Backing Predictor (BPD)

The BPD exists to capture more branches and more complicated branching behaviors.

  • Goal is to provide high accuracy in a (hopefully) dense area;

  • Only provides taken/not-taken predictions; it does not predict branch targets (see the GShare sketch below);
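
Since the BPD is described as GShare-like, here is a minimal GShare sketch (sizes and names are assumptions, not BOOM’s configuration): a global history register (GHR) of recent taken/not-taken outcomes is XORed with the PC to index a table of 2-bit saturating counters, which is trained when the branch resolves in the back-end.

```scala
// Minimal GShare-style predictor sketch; assumed sizes, not BOOM's actual BPD.
class GshareSketch(historyBits: Int = 12) {
  private val tableSize = 1 << historyBits
  private val counters  = Array.fill(tableSize)(1)   // 2-bit counters, start weakly not-taken
  private var ghr: Long = 0                          // global history register

  private def index(pc: Long): Int =
    (((pc >>> 2) ^ ghr) & (tableSize - 1)).toInt     // hash PC with global history

  def predict(pc: Long): Boolean = counters(index(pc)) >= 2   // taken if counter is 2 or 3

  // Called when the branch resolves in the back-end. A real design trains with the
  // history snapshot captured at prediction time (e.g. stored in the FTQ); this sketch
  // simply assumes no intervening predictions happened in between.
  def update(pc: Long, taken: Boolean): Unit = {
    val i = index(pc)
    counters(i) = if (taken) math.min(counters(i) + 1, 3) else math.max(counters(i) - 1, 0)
    ghr = ((ghr << 1) | (if (taken) 1L else 0L)) & (tableSize - 1)
  }
}
```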

Global History Register (GHR)

todo


Created Mar 28, 2022 // Last Updated Feb 22, 2023
