CHERI Source Code Reading


TODOs

  • cpu instructions;
  • l2cache;
  • icache;
  • dcache;
  • tagCache; init done; compilation passed.

Q&A

How is memory partitioned?

  • see MultiLevelTagLookup.bsv.

Possible bugs

Is the burst length 8 or 4?

  • From TagController.bsv, peekMemResponse: one tagCache response is used for every frame of a burst (the frame is the index), but a tagCache response only contains the tags for 4 flits of data.
  • But somewhere else it says a burst can be up to 8?

Do the dCache/iCache call the tag controller? If so, how?

No. The TagController sits below the L2Cache and has an L3-like tag cache. See Tag Controller.


Reference 1

Key files

cheri/trunk/ Root of BERI1 source tree

  • boards/ FPGA board projects
  • sw/ integrated software component source code
  • ip/ destination of generated Verilog files

In BERI1 source code:

  • MIPS.bsv: Types and shared functions for the design
  • MIPSTop.bsv: Top-level module implementing instruction and register fetch, which instantiates all other modules
  • Scheduler.bsv: Pre-decode stage of the pipeline
  • Decode.bsv
  • Execute.bsv
  • MemAccess.bsv: Memory access and writeback stage of the pipeline
  • Memory.bsv: Memory subsystem, which instantiates the caches, merging logic, and memory interface
  • ICache.bsv: Instruction level 1 cache
  • DCache.bsv: Data level 1 cache
  • Interconnect.bsv: Package including buses for implementing the memory hierarchy
  • L2Cache.bsv
  • CoreCache.bsv: Core cache module used in all caches
  • TopAxi.bsv: Top-level module adapting BERI’s memory interface to an AXI bus interface
  • TopSimAxi.bsv: Top-level module interfacing BERI’s memory interface with the PISM bus for C peripheral models
  • ForwardingPipelineRegFile.bsv: forwarding register file
  • CP0.bsv: Coprocessor 0 containing all configuration registers
  • TLB.bsv: 40-entry TLB with three cached interfaces
  • CapCop.bsv: Module implementing the capability coprocessor

Macros for conditional compiling

  • CAP: Include capability coprocessor
  • COP1: Include floating point unit
  • COP3: Include experimental CP3
  • DCACHECORE: Use alternative DCache implementation
  • ICACHECORE: Use alternative ICache implementation
  • MULTI: Number of cores
  • MICRO: Do not include the TLB and L2 cache
  • NOBRANCHPREDICTION: Wait for committed branch targets
  • NOT_FLAT: Build with all possible synthesis boundaries
  • NOTAG: Bypass tag cache for capabilities (return True???)

Testing

Run test suite

cd cheritest/trunk/
make test

Python Nose Framework

Tagged Memory and Data Paths

Two pipeline stages, memory access and instruction fetch, access the memory with tags.

Data: Memory Access

TODO: make clear how data is accessed on the way from the CPU to memory, and how the tag is accessed in this process.

see Memory Access

The data path until TLB:

MemAccess.bsv: issue memory request: module mkMemAccess: method Action enq() -> m.startMem(addr, size,...)

-> Memory.bsv: receive memory request from CPU: module mkMIPSMemory: (interface DataMemory) method Action startMem(addr, size,...):

-> Memory.bsv: send request to TLB and get response. 1) prepare CacheRequest{size, TlbResponse{addr,…},…} and TlbRequest{addr,…}; 2) send the request and get the response from the TLB: req.tr <- tlb.tlbLookupData.request(tlbReq);

-> If TLB hit: send request to cache: dCache.put(req);

-> If TLB miss: dCacheFetch <= req; dCacheDelayed <= True;

The Data path within DataCache (after TLB hit):

The data path on a TLB miss:

Query the TLB response again with req.tr <- tlb.tlbLookupData.response();

Then do dCache.put(req), the same as on a TLB hit.

rule feedDCache(dCacheDelayed);
  CacheRequestDataT req = dCacheFetch;
  req.tr <- tlb.tlbLookupData.response();
  dCache.put(req);
  dCacheDelayed <= False;
endrule

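
The hit and miss paths above can be mimicked with a small Python toy model. This is only a sketch under stated assumptions: `ToyTlb`, `ToyMemory`, the 4 KiB page size, and the split into an immediately available lookup and a slower complete lookup are all invented for illustration; only the names `startMem`, `dCacheFetch`, `dCacheDelayed`, and `feedDCache` come from the notes, and the real Bluespec rules are scheduled by the hardware rather than called by hand.

```python
PAGE_SHIFT = 12  # assumed 4 KiB pages, for illustration only


class ToyTlb:
    """Hypothetical stand-in for tlb.tlbLookupData (not the BERI TLB)."""

    def __init__(self, fast, full):
        self.fast = dict(fast)   # translations answered by request()
        self.full = dict(full)   # translations answered later by response()
        self.pending = None

    def _translate(self, ppn, vaddr):
        return (ppn << PAGE_SHIFT) | (vaddr & ((1 << PAGE_SHIFT) - 1))

    def request(self, vaddr):
        """Return the physical address on a hit, or None on a miss."""
        vpn = vaddr >> PAGE_SHIFT
        if vpn in self.fast:
            return self._translate(self.fast[vpn], vaddr)
        self.pending = vaddr
        return None

    def response(self):
        """Deliver the delayed translation for the pending miss."""
        if self.pending is None:
            raise RuntimeError("no outstanding TLB request")
        vaddr, self.pending = self.pending, None
        return self._translate(self.full[vaddr >> PAGE_SHIFT], vaddr)


class ToyMemory:
    """Mirrors the startMem / feedDCache flow described in the notes."""

    def __init__(self, tlb):
        self.tlb = tlb
        self.d_cache_fetch = None      # plays the role of dCacheFetch
        self.d_cache_delayed = False   # plays the role of dCacheDelayed
        self.d_cache_puts = []         # records what dCache.put(req) would receive

    def start_mem(self, vaddr, size):
        req = {"vaddr": vaddr, "size": size, "tr": self.tlb.request(vaddr)}
        if req["tr"] is not None:      # TLB hit: go straight to the cache
            self.d_cache_puts.append(req)
        else:                          # TLB miss: park the request
            self.d_cache_fetch = req
            self.d_cache_delayed = True

    def feed_d_cache(self):
        """Counterpart of rule feedDCache; fires only while a request is parked."""
        if not self.d_cache_delayed:
            return False
        req = self.d_cache_fetch
        req["tr"] = self.tlb.response()
        self.d_cache_puts.append(req)
        self.d_cache_delayed = False
        return True
```

In this model a hit reaches the cache in one step, while a miss reaches it only after `feed_d_cache()` has run once.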
  • MIPSTop.bsv
  • Q&A: What is the specification of the memory interface? No matter what the capability width is, the MIPS processor has a memory interface with a 256-bit data width (the WORD width) and a 35-bit WORD address. As there are 2^5 = 32 bytes = 256 bits per word, this is equivalent to a 35 + 5 = 40-bit byte address. So the memory interface is not byte addressable, only word addressable, although the processor can still address individual bytes internally.
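
The address arithmetic above can be checked with a few lines of Python. This is a minimal sketch: the constant and function names are made up for illustration, and only the widths (256-bit word, 35-bit word address) come from the notes.

```python
WORD_BITS = 256                                  # data width of one memory word
BYTES_PER_WORD = WORD_BITS // 8                  # 32 bytes per word
OFFSET_BITS = BYTES_PER_WORD.bit_length() - 1    # log2(32) = 5
WORD_ADDR_BITS = 35                              # width of the word address
BYTE_ADDR_BITS = WORD_ADDR_BITS + OFFSET_BITS    # 35 + 5 = 40


def split_byte_address(byte_addr):
    """Split a 40-bit byte address into (word address, byte offset in the word)."""
    if not 0 <= byte_addr < (1 << BYTE_ADDR_BITS):
        raise ValueError("byte address does not fit in 40 bits")
    return byte_addr >> OFFSET_BITS, byte_addr & (BYTES_PER_WORD - 1)


def join_word_address(word_addr, offset):
    """Inverse of split_byte_address."""
    return (word_addr << OFFSET_BITS) | offset
```

For example, `split_byte_address(0x1234)` gives word address `0x91` and byte offset `0x14`: the memory interface only sees the word address, and the byte offset is handled inside the processor.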

  • Example CSC
  • Q&A: How is a store-capability-via-capability instruction implemented in the processor?
    ISAv7 CSC: Store Capability via Capability
        CSC cs,rt,offset(cb)
        CSCR cs,rt(cb)
        CSCI cs,offset(cb)
    Capability register cs is stored at the memory location cb.base + cb.offset + rt + 16*offset.
    | Bit   | Size | Value  |
    |-------|------|--------|
    | 31-26 | 6    | 0x3e   |
    | 25-21 | 5    | cs     |
    | 20-16 | 5    | cb     |
    | 15-11 | 5    | rt     |
    | 10-0  | 11   | offset |
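
The field layout in the table can be exercised with a short Python sketch. The helper names are invented here, and treating the 11-bit offset as a signed value is an assumption that the table itself does not state; the opcode, field positions, and address formula are taken from the notes.

```python
CSC_OPCODE = 0x3E  # bits 31-26, from the table


def encode_csc(cs, cb, rt, offset):
    """Pack the CSC fields into a 32-bit instruction word."""
    for name, reg in (("cs", cs), ("cb", cb), ("rt", rt)):
        if not 0 <= reg < 32:
            raise ValueError(f"{name} must be a 5-bit register number")
    if not -1024 <= offset < 1024:   # assumption: signed 11-bit immediate
        raise ValueError("offset must fit in 11 signed bits")
    return (CSC_OPCODE << 26) | (cs << 21) | (cb << 16) | (rt << 11) | (offset & 0x7FF)


def decode_csc(word):
    """Unpack a 32-bit instruction word into its CSC fields."""
    offset = word & 0x7FF
    if offset & 0x400:               # sign-extend bit 10 (same assumption as above)
        offset -= 0x800
    return {
        "opcode": (word >> 26) & 0x3F,
        "cs": (word >> 21) & 0x1F,
        "cb": (word >> 16) & 0x1F,
        "rt": (word >> 11) & 0x1F,
        "offset": offset,
    }


def csc_address(cb_base, cb_offset, rt_value, offset):
    """Store address as given in the notes: cb.base + cb.offset + rt + 16*offset."""
    return cb_base + cb_offset + rt_value + 16 * offset
```

For instance, `encode_csc(1, 2, 3, 4)` packs to `0xF8221804`, and `csc_address(0x1000, 0x20, 8, 2)` evaluates to `0x1048`.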

  • Writeback
  • Q&A: Who calls this module and what is the input? It is the final pipeline stage; the previous stage is MemAccess, connected via a FIFO of ControlToken. It updates the register file. Where does it go? Reference 1

  • MIPS.bsv
  • Reference 1

  • ICache
  • Todones: CacheInstIfc (in MIPS.bsv):
      - put() -> rule doPut() //
      - getRead()
      - invalidate()
      - getConfig()
      - getResponse() //
      - interface Master#
    Q&A: Who calls this module and what is the input? It is called from Memory.bsv, rule feedICache, after TLB translation: iCache.put(req); The input is the physical memory access request req of type CacheRequestInstT; see Memory.bsv. Where does it go? It sends a memory request to CacheCore, iCache.

  • Test
  • Reference 1. Files in the test suite:
    cheritest/trunk/ Root of the test suite
      • gxemul_log/ Output of gxemul test
      • log/ Holds output of test
      • obj/ Holds obj files, memory images, and assembly dumps
      • tests/ Individual tests and their matching Python Nose classes
      • tools/ Utility functions to perform common functions such as interpreting BERI simulator and gxemul output
      • fuzzing/ Scripts for fuzz testing the TLB
      • init.s A thin loader to set up various aspects of CPU and memory configuration

  • tagsparams.py
  • File: cherilibs/trunk/tagsparams.py. Parameters:
      -c, --cap-size, default=256: capability size in bits;
      -s, --structure, default=[0]: list, from leaf to root, of branching factors describing the tags tree;
      -t, --top-addr, default=0x4000_0000: memory address the tags should start growing down from;
      -a, --addr-align, default=32: alignment requirement (in bytes) for table level addresses;
      -m, --mem-size, default=(2^32 + 2^20): size of the memory to be covered by the tags;
      -b, --bsv-inc-output, const="TagTableStructure.bsv", default=None;
      -l, --linker-inc-output, const="tags-params.
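
To get a feel for what the default parameters imply, here is a rough Python calculation. It is not taken from tagsparams.py: it assumes a flat table (no tree levels), one tag bit per capability-sized chunk of memory, and a table that ends at top-addr and has its base rounded down to the alignment. The function names are hypothetical.

```python
def flat_tag_table(mem_size=2**32 + 2**20, cap_size_bits=256,
                   top_addr=0x4000_0000, addr_align=32):
    """Return (number of tag bits, table size in bytes, table base address).

    Assumes one tag bit per capability-sized chunk and a single flat level.
    """
    cap_bytes = cap_size_bits // 8
    n_tags = -(-mem_size // cap_bytes)       # ceiling division
    table_bytes = -(-n_tags // 8)            # 8 tag bits per table byte
    base = (top_addr - table_bytes) // addr_align * addr_align
    return n_tags, table_bytes, base


def tag_bit_location(phys_addr, base, cap_size_bits=256):
    """Return (byte address in the table, bit index) of the tag for phys_addr."""
    tag_index = phys_addr // (cap_size_bits // 8)
    return base + tag_index // 8, tag_index % 8
```

Under these assumptions the defaults give 134,250,496 tag bits, a table of 16,781,312 bytes (16 MiB + 4 KiB), and a base address of 0x3EFFF000.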

  • Expand Bits
  • Tag Lookup
  • Q&A: How many tags does it cache per physical memory access request? Overview: it gets a physical address access request and returns the tag associated with that physical address. This is a cache implementation based on CacheCore, similar to the D/ICache and L2Cache. One Covered Region is a cached tag region. It is of type LineTags, which is a vector (of size 4) of tags; each vector element contains the tags for one flit (i.

  • MultiLevelTagLookup.bsv
  • Todones: rules/functions processing different states:
      - Init:
          - doLookup condition !Init
      - Idle:
          - Slave cache.request.canPut(), put(),
          - doLookup condition !Idle
      - ReadTag, SetTag, ClearTag:
          - used in doLookup
          - set in Slave cache.request.put(req)
      - FoldZeroes:
          - used in doLookup
          - set in doLookup,
    rules triggered by getReq.send(): -> rule drainMemRsp: // done
    rule doLookup: -> ClearTags: -> getOldTagsEntry() // done // do nothing if flat table. -> getReq.send() // done -> doTransition() // done
    Slave: -> request.

  • Master Slave
  • Q&A: Master/Slave: how do they communicate? ModuleA (master) <----> (slave) ModuleB (master) <----> (slave) ModuleC. The Master controls the communication: it does the put/get by calling the Slave's interface methods. The Slave waits to be called and receives the data. A module that has a Slave interface will also have a Master interface, in order to pass the data on to the Slave interface of the next module. Right? See Bluespec basics for connectable interfaces [/en/arch/basics/bluespec/packages/connectable/client-server/]
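
The ModuleA/ModuleB/ModuleC chain can be pictured with a tiny Python model. This only illustrates the calling direction described above; `ToySlave` and `forward_to` are invented names, the calls are synchronous, and nothing here reflects how Bluespec actually schedules rules or implements mkConnection.

```python
from collections import deque


class ToySlave:
    """A module's slave side: it waits to be called with put() and get()."""

    def __init__(self, name, handler, trace):
        self.name = name
        self.handler = handler        # how this module produces a response
        self.trace = trace            # shared log of who received what
        self.responses = deque()

    def put(self, req):
        self.trace.append(f"{self.name} got {req!r}")
        self.responses.append(self.handler(req))

    def get(self):
        return self.responses.popleft()


def forward_to(downstream):
    """Master side of a middle module: pass the request to the next slave."""
    def handler(req):
        downstream.put(req)           # the master drives the put ...
        return downstream.get()       # ... and collects the response
    return handler


# ModuleA (master) -> ModuleB (slave + master) -> ModuleC (slave)
trace = []
module_c = ToySlave("C", lambda req: ("data for", req), trace)
module_b = ToySlave("B", forward_to(module_c), trace)

module_b.put("read 0x40")             # ModuleA acting as master
reply = module_b.get()
```

ModuleB never initiates anything toward ModuleA; it only reacts to `put`, and then acts as a master toward ModuleC.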

  • L2Cache
  • Q&A: Who calls it and what is the input? Where does it go? It hands the request over to the tagController and calls the tagController to get the response. How is it related to tagged memory? The master interface memory is re-used by the TagController. File: cherilibs/trunk/L2Cache.bsv. Interface L2CacheIfc: the Master is memory; the Slave is cache. mkConnection(l2CacheMemory, tagController.cache); means the Master is l2CacheMemory and the Slave is tagController.cache; that is, the tag controller is called by the l2Cache, which sends requests to and gets responses from the tagController.

  • Merge.bsv
  • Q&A: Who calls this and what is the input? i/dcache requests go through here and are forwarded to the l2cache; l2cache responses are sent back here and forwarded to the i/dcache. Where does this go? It forwards requests to the l2cache memory (Memory.bsv: mkConnection(theMemMerge.merged, l2Cache.cache)) and returns responses to the icache/dcache via slaves[i].
    File: cheri/trunk/Merge.bsv, module mkMergeFast
        module mkMergeFast(MergeIfc#(numIfc));
    numIfc = 2 in Memory.bsv. Connections:
        // Memory.bsv
        mkConnection(iCache.memory, theMemMerge.slave[0]);
        mkConnection(dCache.
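
A toy Python model of this merge step follows. It assumes that responses are steered back to the right port by a master ID carried in each request, and that the dCache sits on port 1; the excerpt confirms only that the iCache is on slave[0]. `ToyMerge` and its methods are hypothetical names, not the Merge.bsv interface.

```python
from collections import deque


class ToyMerge:
    """Funnels requests from several ports into one queue and routes responses back."""

    def __init__(self, num_ifc=2):
        self.merged = deque()                                    # toward the L2 cache
        self.slave_responses = [deque() for _ in range(num_ifc)]

    def put_request(self, port, req):
        """Called by the cache on `port`; tag the request with its origin."""
        tagged = dict(req)
        tagged["master_id"] = port
        self.merged.append(tagged)

    def put_response(self, rsp):
        """Called with an L2 response; route it by the echoed master ID."""
        self.slave_responses[rsp["master_id"]].append(rsp)


def toy_l2(merge):
    """Stand-in L2: answers every merged request and echoes its master ID."""
    while merge.merged:
        req = merge.merged.popleft()
        merge.put_response({"master_id": req["master_id"], "data": req["addr"] + 1})


ICACHE_PORT, DCACHE_PORT = 0, 1   # port 1 for the dCache is an assumption

merge = ToyMerge(num_ifc=2)
merge.put_request(ICACHE_PORT, {"addr": 0x100})
merge.put_request(DCACHE_PORT, {"addr": 0x200})
toy_l2(merge)
```

After `toy_l2` runs, each port's response queue holds only the answer to its own request, which is the property the master ID provides in this model.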

  • CacheCore.bsv
  • Q&A: Does it treat memory tag read/write requests differently? No. Tag reads/writes can only be distinguished by master ID. However, the master ID does not change the logic in CacheCore; CacheCore only copies the master ID from the request to the response. Who calls this and what is the input? DCache calls this to send requests and get responses via core.put(CheriMemRequest reqIn) and core.response.get. Init: CacheCore#(4, TSub#(Indices,1), 1) core <- mkCacheCore(cacheId, wmb, RespondAll, InOrder, DCache, zeroExtend(memReqs.

  • CP0
  • Todos:
      - Tracking tlbLookupData.request/response for TLB hit/miss handling:
        MIPS.bsv: CP0Ifc declaration, contains the subinterface TranslationIfc tlbLookupData;
        CP0.bsv: mkCP0: definition of TranslationIfc tlbLookupData: .request(reqIn) and .response() invoke tlb.lookup[1/2].request(reqIn) and .response(), which are defined in the mkTLB module in TLB.bsv: lookup = lookups. See [../tlb], Do TLB search.
      - Tracking cache for hit/miss handling.
    Reference 1. File: cheri/trunk/CP0.bsv. Module mkCP0: mkCP0#(Bit#(16) coreId)(CP0Ifc). Interfaces: interface CP0Ifc, in cheri/trunk/MIPS.bsv. Methods: register read: readReq; register write-pending Bool flag: writePending; register write: writeReg; …; reading the current address space identifier: getAsid // a method to get current code/data page tags?

  • MemTypes.bsv
  • Data. Reference 1. The types Data#, CapTags, CapsPerFlit, and BytesPerFlit. See CheriBusBytes in MemTypes.
    Data: it contains both the data and the capability tag. TODO: What is the data_width? What is the relationship between the number of tag bits, this data_width, and the CHERI bits?
    Data# definition:
        // Data type
        typedef struct {
          `ifdef USECAP
            // is this frame has capabilities
            CapTags cap;
          `endif
          // actual data
          Bit#(width) data;
        } Data#(numeric type width) deriving (Bits, Eq, FShow);
    CapTags definition

  • Tag Controller
  • Todones: peekMemResponse(): when grabbing the tag response from the tagCache (lookupRsp.first), it needs to be reviewed so that it matches the CheriTagResponse returned in MultiLevelTagLookup.bsv. Q&A: Who calls this and what is the input? See Input/Output. How does it get feedback from memory? Via the function peekMemResponse(): mRsps.first; see the section on helper functions. Where is this connected? It connects to the L2 cache, and it connects to memory to provide a proxied memory interface.

  • DCache.bsv
  • Todones: CacheDataIfc:
      - put() -> rule doPut() //
      - getResponse() // done.
    Q&A: Who calls this module and what is the input? It is called from Memory.bsv, interface DataMemory: startMem, after TLB translation: dCache.put(req); The input is the physical memory access request req of type CacheRequestDataT; see Memory.bsv. Where does it go? It sends a memory request to CacheCore (see CacheCore) and returns the response back to Memory.

  • TLB.bsv
  • Q&A: How does the TLB read/write permissions in the page table? Can we add more bits for permissions/types? Reference 1 2. MIPS R4000 basics: 48 TLB entries, each of which can map variable-sized pages from 4 KB to 16 MB. Each address translation is tagged with the most-significant bits of its virtual address and a per-process identifier. Instruction TLB: a two-entry instruction TLB. Joint TLB: upon a TLB miss, software refills the JTLB from a page table resident in memory.

  • TagCache.bsv
  • Q&A: Who calls this and what is the input? ??? No one calls mkTagCache?! Where does this go? Reference 1: cherilibs/trunk/TagCache.bsv

  • Memory.bsv
  • Q&A: Who calls this and what is the input? Where does it go? It sends a cache memory request to the DCache: dCache.put(req); see DCache.bsv.
    Reference 1. MIPS memory module, file cheri/trunk/Memory.bsv:
        module mkMIPSMemory#(Bit#(16) coreId, CP0Ifc tlb)(MIPSMemory);
    MIPS memory interfaces: MIPSMemory, DataMemory, InstructionMemory, MemConfiguration, Server#(CoProMemAccess, CoProRegs)
        // cheri/trunk/Memory.bsv
        interface MIPSMemory;
          interface DataMemory dataMemory;
          interface InstructionMemory instructionMemory;
          interface MemConfiguration configuration;
          `ifdef COP1
            interface Server#(CoProMemAccess, CoProReg) cop1Memory;
          `endif
          `ifdef MULTI
            method Action invalidateICache(PhyAddress addr);
            method Action invalidateDCache(PhyAddress addr);
            method ActionValue#(Bool) getInvalidateDone;
            interface Master#(CheriMemRequest, CheriMemResponse) dmemory;
            interface Master#(CheriMemRequest, CheriMemResponse) imemory;
          `else
            interface Master#(CheriMemRequest, CheriMemResponse) memory; // the generic main memory interface as a client.

  • Execute.bsv
  • Q&A: How does it calculate the memory access address? Where does it access the TLB?
    Global functions:
        function Bit#(a) arithmeticShift(Bit#(a) toShift, Bit#(b) shiftAmount)
        function Bit#(a) arithmeticShift2(Bit#(a) toShift, Bit#(b) shiftAmount)
    mkExecute module
    Input: MIPSRegFileIfc rf, WritebackIfc writeback, CP0Ifc cp0, CoProIfc cop1, CapCopIfc capCop, FIFO#(ControlTokenT) inQ
    Rules: finishMultiplyOrDivide, deliverPendingOp
    Methods: enq, first, deq, clear
    States:
        FIFO#(ControlTokenT) outQ <- mkFIFO;
        MulDivIfc mul <- mkMulDiv;
        Reg#(MIPSReg) hi <- mkReg(64'b0);
        Reg#(MIPSReg) lo <- mkReg(64'b0);
        FIFOF#(Bool) hiLoPending <- mkFIFOF1;
        FIFOF#(ControlTokenT) pendingOps <- mkFIFOF1;
        Reg#(Bit#(16)) coreid <- mkConfigReg(0);
    method Action enq(ControlTokenT di)
    Condition:

  • CoProConversionFunctions.bsv
  • // cheri/trunk/FPU/CoProFPConversionFunctions.bsv
    function Bit#(m) truncateLSB(Bit#(n) value);
      return value[valueOf(n)-1:valueOf(n)-valueOf(m)];
    endfunction
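
Despite its name, the slice in truncateLSB keeps the m most-significant bits of an n-bit value and discards the n - m least-significant ones. A Python equivalent, written only to illustrate that behaviour (the widths are passed explicitly because Python integers carry no bit width):

```python
def truncate_lsb(value, n, m):
    """Keep the top m bits of an n-bit value, i.e. value[n-1 : n-m]."""
    if not 0 < m <= n:
        raise ValueError("need 0 < m <= n")
    if not 0 <= value < (1 << n):
        raise ValueError("value does not fit in n bits")
    return value >> (n - m)
```

For example, `truncate_lsb(0xABCD, 16, 8)` keeps the upper byte, `0xAB`.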

  • Scheduler.bsv
  • Q&A: Who calls this module and what is the input? Where does it go?
    Reference 1
        //cheri/trunk/Scheduler.bsv
        // The mkScheduler module does a "pre-decode" of the instruction to find which
        // register numbers may be fetched and to classify the branch behaviour of the
        // instruction for the branch predictor.
        module mkScheduler#(
          // The scheduler needs the branch interface so that it can report the branch type
          // for the next prediction.

  • MemAccess.bsv
  • Reference 1. Memory access stage of the pipeline
        //cheri/trunk/MemAccess.bsv
        module mkMemAccess#(
          DataMemory m
          `ifdef USECAP
            , CapCopIfc capCop
          `endif
        )(PipeStageIfc);
    Overview
    Input: DataMemory m: the memory hierarchy, which needs the system control processor for TLB integration; CapCopIfc capCop.
    Methods: enq (ControlTokenT er), first, deq, clear // should never be called
    method Action enq(ControlTokenT er): Input: ControlTokenT er. Output: ControlTokenT mi. // outQ.enq(mi) // mi has the updated

  • CapCop.bsv
  • Reference 1. CapCop.bsv: cheri/trunk/CapCop.bsv
    Overview
    Interface: CapCopIfc
    Module: mkCapCop#(Bit#(16) coreId) (CapCopIfc)
    Functions: getBase(cap), getLength(cap), getOffset(cap), getType(cap), getSealed(cap), checkRegAccess(Perms, CapReg), privileged(Perms), getPerms(CapFat).
    mkCapCop states:
        Reg#(Capability) pcc <- mkConfigReg(defaultCap);
        FIFOF#((BufferedPCC)) pccUpdate <- mkUGFIFOF1();
        FIFO#(CapControlToken) inQ
        FIFO#(CapControlToken) dec2exeQ
        FIFO#(CapControlToken) exe2memQ
        FIFO#(CapControlToken) mem2wbkQ
        FIFOF#(ExceptionEvent) exception
        Reg#(CapCause) causeReg
        FIFOF#(CapCause) causeUpdate
        FIFO#(LenCheck) lenChecks
        FIFO#(CapCause) lenCause
        Reg#(Bool) capBranchDelay
        Reg#(CapState) capState
        Reg#(UInt#(5)) count
    Rules: initialize: calls regFile.writeRaw 32 times and sets capState from Init to Ready.


  1. BERI HW reference. 2015.
Created Mar 31, 2020 // Last Updated May 30, 2020
