Boot

x86: BIOS -> boot0 -> boot1 -> BTX -> boot2 ->

BIOS

  • early hw init, POST
  • MBR (boot0) loaded from the first absolute disk sector to 0x7c00

boot0 stage

Master Boot Record (MBR)

Note:

  • The first piece of code under FreeBSD control.
  • Must fit into 512 bytes.
  • boot0.S is assembled “as is”: the assembly maps one-to-one to a flat binary with no ELF wrapper; see sys/boot/i386/boot0/Makefile (stand/i386/boot0/Makefile in newer source trees), where a special LDFLAGS setting (--oformat,binary) strips out any file formatting

Function: Scan the partition table and let the user choose which partition to boot from.

  • relocate itself from 0x7c00 to the location it was linked to execute (0x600)
  • load the first disk sector from the FreeBSD slice to address 0x7c00
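
The partition-table scan can be tried outside the boot environment. The following is a minimal Python sketch, assuming the classic MBR layout (four 16-byte entries starting at offset 0x1be, signature bytes 0x55 0xaa at offset 510) and partition type 0xa5 for FreeBSD slices. The function name and the sample image are made up for illustration, and boot0's menu and keyboard handling are not modelled.

```python
import struct

SECTOR_SIZE = 512
PART_TABLE_OFF = 0x1BE   # start of the four partition entries
FREEBSD_TYPE = 0xA5      # MBR partition type used for FreeBSD slices

def scan_partitions(mbr):
    """Return (index, active, type, start_lba, sector_count) for each used entry."""
    if len(mbr) != SECTOR_SIZE or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    found = []
    for i in range(4):
        entry = mbr[PART_TABLE_OFF + 16 * i:PART_TABLE_OFF + 16 * (i + 1)]
        # status, 3 CHS bytes, type, 3 CHS bytes, start LBA, sector count
        flag, ptype, start_lba, count = struct.unpack("<B3xB3xII", entry)
        if ptype != 0:
            found.append((i, flag == 0x80, ptype, start_lba, count))
    return found

# Hypothetical image: one active FreeBSD slice starting at LBA 63.
image = bytearray(SECTOR_SIZE)
image[PART_TABLE_OFF:PART_TABLE_OFF + 16] = struct.pack(
    "<B3xB3xII", 0x80, FREEBSD_TYPE, 63, 2048)
image[510:512] = b"\x55\xaa"
slices = scan_partitions(bytes(image))
```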

boot1 stage

sys/boot/i386/boot2/boot1.S or stand/i386/boot2/boot1.S

Note:

  • First of 3 booting stages;
  • boot1 is 512 bytes; boot2 is much larger.
  • boot1 and boot2 (which contains Boot Extender, BTX, and boot2 client) are both included in a single file /boot/boot on disk (cat boot1 boot2 > boot)
  • BTX executes in kernel mode; the boot2 client executes in user mode.

Function: load the next boot stage: a server called BTX, a client called boot2.

  • start -> main
  • relocate itself from 0x7c00 to 0x700, but do not jump to 0x700 (it was linked to execute at 0x7c00).
  • reread the MBR and rescan the partition table to find where the FreeBSD slice starts.
  • load the first 16 sectors (the all-in-one boot file, which contains boot1, boot2, and BTX) to 0x8c00, which places BTX at 0x9000 (execution at main.5),
  • enable access to memory above 1MB
  • jump to the BTX server
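
The load addresses above imply a simple layout, which the following arithmetic sketch spells out. It assumes 512-byte sectors and uses only the addresses quoted in these notes; the variable names are illustrative.

```python
SECTOR_SIZE = 512
LOAD_ADDR = 0x8C00   # where boot1 places the first 16 sectors
BTX_ADDR = 0x9000    # where BTX ends up, per the notes above
NSECT = 16

# First byte past the loaded image.
load_end = LOAD_ADDR + NSECT * SECTOR_SIZE
# Index of the sector (within the loaded image) at which BTX begins.
btx_sector = (BTX_ADDR - LOAD_ADDR) // SECTOR_SIZE
```

Under these assumptions the image occupies 0x8c00 up to (but not including) 0xac00, and BTX begins two sectors into it.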

BTX server (x86)

cite: https://www.freebsd.org/doc/en_US.ISO8859-1/books/arch-handbook/btx-server.html

sys/boot/i386/btx/btx/btx.S or stand/i386/btx/btx/btx.S

Function:

  • switch from 16-bit real mode to 32-bit protected mode
    • modify the interrupt vector table (IVT), providing exception and interrupt handlers for real-mode code;
    • create the Interrupt Descriptor Table (IDT), providing exception and interrupt handlers for protected-mode code:
      • processor exceptions
      • hardware interrupts
      • two system calls
      • V86 interfaces
    • create a Task-State Segment (TSS), holding information about a task for context-switching between the user-mode boot2 client and the kernel-mode BTX server;
    • set up the Global Descriptor Table (GDT). Entries are provided for supervisor code and data, user code and data, and real-mode code and data.
  • pass control to the client boot2

    /* stand/i386/btx/btx/btx.S 
    */
    
    start:						# Start of code
    /*
    * BTX header.
    */
    btx_hdr:	.byte 0xeb			# Machine ID
    		.byte 0xe			# Header size
    		.ascii "BTX"			# Magic
    		.byte 0x1			# Major version
    		.byte 0x2			# Minor version
    		.byte BTX_FLAGS			# Flags
    		.word PAG_CNT-MEM_ORG>>0xc	# Paging control
    		.word break-start		# Text size
    		.long 0x0			# Entry address

The last field of the header is the entry point of the boot2 client. This field is patched at link time.
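
To make the header layout concrete, here is a small Python sketch that packs and parses the 16 bytes declared above. The field names in the dictionary and the sample values for flags, paging control, text size, and entry address are placeholders, not values taken from a real build.

```python
import struct

# .byte, .byte, 3-byte .ascii, 3 x .byte, 2 x .word, .long: 16 bytes, little-endian
BTX_HDR = struct.Struct("<BB3sBBBHHI")

def parse_btx_header(raw):
    """Decode the BTX header at the start of raw; field names are illustrative."""
    (mid, hdr_size, magic, major, minor,
     flags, pgctl, textsz, entry) = BTX_HDR.unpack(raw[:BTX_HDR.size])
    if magic != b"BTX":
        raise ValueError("bad BTX magic")
    return {"machine_id": mid, "hdr_size": hdr_size, "magic": magic,
            "version": (major, minor), "flags": flags,
            "paging_control": pgctl, "text_size": textsz, "entry": entry}

# Placeholder values for everything except machine ID, header size, magic, version.
sample = BTX_HDR.pack(0xEB, 0x0E, b"BTX", 1, 2, 0, 0, 0x600, 0xA000)
hdr = parse_btx_header(sample)
```

One observation that is not stated in the notes: 0xeb is also the x86 opcode for a short jump, so the first two bytes, read as an instruction, skip 0xe bytes and land at offset 16, directly after the header.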

Next is the BTX entry point:

/* stand/i386/btx/btx/btx.S 
*/

/*
 * Initialization routine.
 */
init:		cli				# Disable interrupts
		xor %ax,%ax			# Zero/segment
		mov %ax,%ss			# Set up
		mov $0x1800,%sp		#  stack
		mov %ax,%es			# Address
		mov %ax,%ds			#  data
		pushl $0x2			# Clear
		popfl				#  flags

It

  • disables interrupts,
  • sets up a working stack (starting at address 0x1800), and
  • clears the flags in the EFLAGS register. (0x2 is the cleared value because IA-32 reserves bit 1 of EFLAGS, counting from bit 0, and that bit always reads as 1.)
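
A quick check of why 0x2 counts as "all flags clear": the sketch below lists the architecturally defined flag bits in the low word of EFLAGS and confirms that none of them is set in 0x2. The dictionary is a convenience for illustration, not something taken from btx.S.

```python
# Bit positions of the status and control flags in the low word of EFLAGS.
FLAGS = {"CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7,
         "TF": 8, "IF": 9, "DF": 10, "OF": 11}
RESERVED_ONE = 1 << 1   # bit 1 is reserved and always reads as 1

eflags = 0x2            # the value pushed by `pushl $0x2` before `popfl`
set_flags = [name for name, bit in FLAGS.items() if eflags & (1 << bit)]
```

`set_flags` comes out empty: only the reserved bit is set, and IF in particular stays clear, so interrupts remain disabled.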

Next is the memory initialization code:

/* stand/i386/btx/btx/btx.S 
*/

/*
 * Initialize memory.
 */
		mov $0x5e00,%di		# Memory to initialize
		mov $(0x9000-0x5e00)/2,%cx	# Words to zero
		rep				# Zero-fill
		stosw				#  memory

It clears memory range 0x5e00 - 0x8fff.
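
The word count and the end of the cleared range can be verified with a short simulation of `rep stosw`. This is a sketch on a Python bytearray standing in for the first 64 KiB of memory; it relies on %ax still being zero from the `xor %ax,%ax` in the initialization routine.

```python
MEM_START = 0x5E00
MEM_END = 0x9000                       # exclusive upper bound
words = (MEM_END - MEM_START) // 2     # the value loaded into %cx

mem = bytearray(b"\xff" * 0x10000)     # pretend memory holds garbage
di, ax = MEM_START, 0                  # %di = start address, %ax = 0
for _ in range(words):                 # rep stosw
    mem[di:di + 2] = ax.to_bytes(2, "little")
    di += 2
```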

Then the real-mode IVT is updated:

/* stand/i386/btx/btx/btx.S 
*/

/*
 * Update real mode IDT for reflecting hardware interrupts.
 */
		mov $intr20,%bx			# Address first handler
		mov $0x10,%cx			# Number of handlers
		mov $0x20*4,%di			# First real mode IDT entry
init.0:		mov %bx,(%di)			# Store IP
		inc %di				# Address next
		inc %di				#  entry
		stosw				# Store CS
		add $4,%bx			# Next handler
		loop init.0			# Next IRQ
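
The loop above can be traced with a small simulation. The sketch below models the real-mode IVT as a 1 KiB bytearray (256 entries of 4 bytes: offset, then segment) and uses a made-up address for the intr20 label; %ax is still zero at this point, so every entry gets segment 0.

```python
import struct

INTR20 = 0x1234              # hypothetical address of the intr20 label
ivt = bytearray(0x400)       # real-mode IVT: 256 entries of 4 bytes

bx, di, ax = INTR20, 0x20 * 4, 0
for _ in range(0x10):                      # %cx = number of handlers
    struct.pack_into("<H", ivt, di, bx)    # mov %bx,(%di): store IP
    di += 2                                # inc %di, inc %di
    struct.pack_into("<H", ivt, di, ax)    # stosw: store CS ...
    di += 2                                # ... and advance %di
    bx += 4                                # handler stubs are 4 bytes apart

def vector(n):
    """Return (segment, offset) stored for interrupt vector n."""
    ip, cs = struct.unpack_from("<HH", ivt, n * 4)
    return cs, ip
```

Only vectors 0x20 through 0x2f are written; their offsets step through INTR20, INTR20 + 4, and so on.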

Next, the IDT is created:

/* stand/i386/btx/btx/btx.S 
*/

/*
 * Create IDT.
 */
		mov $0x5e00,%di			# IDT's address
		mov $idtctl,%si			# Control string
init.1:		lodsb				# Get entry
		cbw				#  count
		xchg %ax,%cx			#  as word
		jcxz init.4			# If done
		lodsb				# Get segment
		xchg %ax,%dx			#  P:DPL:type
		lodsw				# Get control
		xchg %ax,%bx			#  set
		lodsw				# Get handler offset
		mov $SEL_SCODE,%dh		# Segment selector
init.2:		shr %bx				# Handle this int?
		jnc init.3			# No
		mov %ax,(%di)			# Set handler offset
		mov %dh,0x2(%di)		#  and selector
		mov %dl,0x5(%di)		# Set P:DPL:type
		add $0x4,%ax			# Next handler
init.3:		lea 0x8(%di),%di		# Next entry
		loop init.2			# Till set done
		jmp init.1			# Continue

Each IDT entry is 8 bytes long and contains:

  • segment/offset information
  • segment type
  • privilege level
  • whether the segment is present in memory or not.
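
As an illustration of the 8-byte entry format, the sketch below packs one gate descriptor following the IA-32 layout and decodes the P:DPL:type byte. The selector value 0x08 for SEL_SCODE and the access bytes 0x8e and 0xee are assumptions made for the example, not values quoted in these notes.

```python
import struct

SEL_SCODE = 0x08   # assumed selector value for the supervisor code segment

def make_gate(offset, selector, p_dpl_type):
    """Pack an 8-byte IA-32 gate descriptor."""
    return struct.pack(
        "<HHBBH",
        offset & 0xFFFF,           # bytes 0-1: handler offset, low 16 bits
        selector,                  # bytes 2-3: code segment selector
        0,                         # byte 4: reserved
        p_dpl_type,                # byte 5: P (1 bit), DPL (2 bits), S+type (5 bits)
        (offset >> 16) & 0xFFFF)   # bytes 6-7: handler offset, high 16 bits

def decode_access(p_dpl_type):
    """Split the P:DPL:type byte into its fields."""
    return {"present": bool(p_dpl_type & 0x80),
            "dpl": (p_dpl_type >> 5) & 0x3,
            "type": p_dpl_type & 0x1F}

gate = make_gate(0x9100, SEL_SCODE, 0x8E)   # hypothetical handler offset
```

With these values, 0x8e decodes as present, DPL 0, type 0xe (a 32-bit interrupt gate), while 0xee differs only in having DPL 3. The bytes written by the init.2 loop (offset at 0, selector at 2, P:DPL:type at 5) match the fields shown here.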

Interrupt numbers:

  • 0x0 to 0xf (exceptions) handled by function intx00
  • 0x10 (also an exception) handled by intx10
  • 0x20 - 0x2f (hardware interrupts), handled by intx20
  • 0x30 (system calls), handled by intx30
  • 0x31 - 0x32, handled by intx31

Note:

  • Only interrupt vectors 0x30, 0x31, and 0x32 are given privilege level 3, the same level as the boot2 client. This is what lets the user-mode client invoke the services provided by BTX.
  • Hardware interrupts and processor exceptions are always handled regardless of privileges involved.
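
The control-string loop (init.1 through init.3) can be modelled as follows. This is a sketch only: the control string below is hypothetical, with counts and privilege levels chosen to match the vector list above, and with invented bitmasks and handler offsets. The real idtctl table in btx.S may differ.

```python
import struct

SEL_SCODE = 0x08    # assumed selector value

# Hypothetical control string: (count, P:DPL:type, bitmask, first handler offset).
IDTCTL = [
    (0x10, 0x8E, 0xFFFF, 0x1000),   # 0x00-0x0f: exceptions
    (0x10, 0x8E, 0x0001, 0x1100),   # 0x10-0x1f: only 0x10 gets a gate
    (0x10, 0x8E, 0xFFFF, 0x1200),   # 0x20-0x2f: hardware interrupts
    (0x01, 0xEE, 0x0001, 0x1300),   # 0x30: system call, DPL 3
    (0x02, 0xEE, 0x0003, 0x1400),   # 0x31-0x32: DPL 3
]

def build_idt(ctl):
    """Fill an IDT image the way the init.1/init.2/init.3 loops do."""
    idt = bytearray(sum(entry[0] for entry in ctl) * 8)
    di = 0
    for count, access, mask, handler in ctl:
        for _ in range(count):
            carry, mask = mask & 1, mask >> 1             # shr %bx
            if carry:                                     # handle this int?
                struct.pack_into("<H", idt, di, handler)  # set handler offset
                idt[di + 2] = SEL_SCODE                   # and selector
                idt[di + 5] = access                      # set P:DPL:type
                handler += 4                              # next handler
            di += 8                                       # next entry
    return idt

idt = build_idt(IDTCTL)
```

The bitmask is what lets a single control entry skip individual vectors: an entry whose bit is clear is left zeroed (not present) but still consumes an 8-byte slot.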

Next, the TSS is initialized:

/* stand/i386/btx/btx/btx.S 
*/

/*
 * Initialize TSS.
 */
init.4:		movb $_ESP0H,TSS_ESP0+1(%di)	# Set ESP0
		movb $SEL_SDATA,TSS_SS0(%di)	# Set SS0
		movb $_TSSIO,TSS_MAP(%di)	# Set I/O bit map base

A hardcoded value is given to the stack pointer and stack segment for privilege level 0 in the TSS.

A value is also given to the I/O Map base address field of the TSS.
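
Because the region holding the TSS was zero-filled earlier, a single byte store is enough to set each field. The sketch below illustrates this on a zeroed bytearray. The field offsets follow the IA-32 32-bit TSS layout; the values used for _ESP0H, SEL_SDATA, and _TSSIO are assumptions for the example, not values quoted in these notes.

```python
import struct

# Field offsets in the IA-32 32-bit TSS.
TSS_ESP0 = 0x04
TSS_SS0 = 0x08
TSS_MAP = 0x66
# Assumed values for the constants used in the code above.
_ESP0H = 0x18       # byte 1 of ESP0, giving ESP0 = 0x1800
SEL_SDATA = 0x10
_TSSIO = 0x68

tss = bytearray(0x68)           # zero-filled, like the cleared memory region
tss[TSS_ESP0 + 1] = _ESP0H      # movb $_ESP0H,TSS_ESP0+1(%di)
tss[TSS_SS0] = SEL_SDATA        # movb $SEL_SDATA,TSS_SS0(%di)
tss[TSS_MAP] = _TSSIO           # movb $_TSSIO,TSS_MAP(%di)

esp0, = struct.unpack_from("<I", tss, TSS_ESP0)
ss0, = struct.unpack_from("<H", tss, TSS_SS0)
iomap, = struct.unpack_from("<H", tss, TSS_MAP)
```

Writing 0x18 into byte 1 of the 4-byte ESP0 field yields 0x1800 when the field is read back as a little-endian value.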

Next, the processor is switched to protected mode:

/* stand/i386/btx/btx/btx.S 
*/

/*
 * Bring up the system.
 */
		mov $0x2820,%bx			# Set protected mode
		callw setpic			#  IRQ offsets
		lidt idtdesc			# Set IDT
		lgdt gdtdesc			# Set GDT
		mov %cr0,%eax			# Switch to protected
		inc %ax				#  mode
		mov %eax,%cr0			#
		ljmp $SEL_SCODE,$init.8		# To 32-bit code
		.code32
init.8:		xorl %ecx,%ecx			# Zero
		movb $SEL_SDATA,%cl		# To 32-bit
		movw %cx,%ss			#  stack
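
Two details of this sequence can be checked numerically: the IRQ offsets passed to setpic in %bx, and the effect of `inc %ax` on CR0. The sketch below assumes that the low byte of %bx is the vector base for the master PIC and the high byte the base for the slave PIC, which is consistent with the hardware interrupts occupying vectors 0x20 - 0x2f. The sample CR0 value is made up.

```python
CR0_PE = 1 << 0     # protection-enable bit in CR0

def pic_vector(irq, bx=0x2820):
    """Map an IRQ number to its interrupt vector (assumed %bl/%bh split)."""
    if not 0 <= irq <= 15:
        raise ValueError("IRQ out of range")
    master_base, slave_base = bx & 0xFF, bx >> 8
    return master_base + irq if irq < 8 else slave_base + (irq - 8)

cr0_real = 0x00000010        # hypothetical real-mode CR0 value, PE clear
cr0_prot = cr0_real + 1      # `inc %ax` sets bit 0 because it was clear
```

Incrementing only works as a bit-set because PE is known to be clear in real mode; an `or $1` would be the general form.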

locore.S

Reference 1

sys/mips/mips/locore.S

CHERI specific

  • create CHERI kernel sealing cap, store as kernel_sealcap;
  • create a universal user cap covering all userspace, store as userspace_cap;
  • create CHERI user sealing cap, user_sealcap;
  • create the swap cap, store as swap_restore_cap;
  • more at cheribsd booting

    GLOBAL(btext)
    ASM_ENTRY(_start)
    VECTOR(_locore, unknown)
    ...
    /*
        * Initialize stack and call machine startup.
        */
    PTR_LA		sp, _C_LABEL(pcpu_space)
    PTR_ADDU	sp, (PAGE_SIZE * 2) - CALLFRAME_SIZ
    
    REG_S	zero, CALLFRAME_RA(sp)	# Zero out old ra for debugger
    REG_S	zero, CALLFRAME_SP(sp)	# Zero out old fp for debugger
    
    PTR_LA	gp, _C_LABEL(_gp)
    
    /* Call the platform-specific startup code. */
    PTR_LA	t9, _C_LABEL(platform_start)
    jalr	t9
    nop
    
    
    	PTR_LA	sp, _C_LABEL(thread0_st)
    	PTR_L	a0, TD_PCB(sp)
    	REG_LI	t0, ~7
    	and	a0, a0, t0
    	PTR_SUBU	sp, a0, CALLFRAME_SIZ
    
    	PTR_LA	t9, _C_LABEL(mi_startup)
    	jalr	t9				# mi_startup(frame)
    	sw	zero, (CALLFRAME_SIZ - 8)(sp)	# Zero out old fp for debugger
    
    	PANIC("Startup failed!")

platform_start

stack pointer: pcpu_space + (page size * 2) - callframe_size
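
The two stack computations in the listing reduce to simple address arithmetic, sketched below. PAGE_SIZE and CALLFRAME_SIZ are placeholder values (the real ones depend on the kernel configuration and ABI), and the function names are illustrative.

```python
PAGE_SIZE = 4096       # assumed page size
CALLFRAME_SIZ = 32     # hypothetical; the real value depends on the ABI

def boot_stack(pcpu_space):
    """Stack used before platform_start: top of two pages, minus one call frame."""
    return pcpu_space + PAGE_SIZE * 2 - CALLFRAME_SIZ

def thread0_stack(pcb_addr):
    """Stack used for mi_startup: PCB address aligned down to 8, minus one frame."""
    aligned = pcb_addr & ~7          # REG_LI t0, ~7 ; and a0, a0, t0
    return aligned - CALLFRAME_SIZ   # PTR_SUBU sp, a0, CALLFRAME_SIZ
```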

mi_startup


  1. Cheribsd github ↩
Created Oct 30, 2019 // Last Updated Apr 30, 2020
