by John Tsiombikas
Last update: 19 April 2018.
So, I want to write something that will make it easier to make bootable PC demos and/or games. I want to just bring up the computer and run my code. No middle-men. No operating systems. Where to start?
I'm targeting the IBM PC compatibles (is this term used any more?) so let me touch briefly on how a PC starts.
Like pretty much all computers, there's a ROM chip sitting on the bus, handling a certain part of the memory address space, which just so happens to include the address from where the processor is designed to start executing instructions when it's powered up, or after a reset. This ROM chip holds a program starting at that address, which initializes what needs initializing, and then looks for something else to run that will (eventually) bring up the operating system.
So far I might have been talking about any computer system, so let's make it more specific. In the case of the IBM PC, the program which resides in ROM is called the BIOS. The processor is an Intel x86 or compatible. The reset vector is at linear address ffff0, at the top of the original 8086/8088 1MB address space. And the BIOS, after it finishes initializing the system, looks for a valid boot sector to load, on a number of storage devices, in an order usually configurable through a BIOS configuration menu.
The boot sector is always the very first sector (512-byte block) of a storage device, and in order for the BIOS to consider it valid, its last 2 bytes must be: 55 aa. When the BIOS finds a valid boot sector, it proceeds to load it into memory at address 7c00, and execute a jump to that address to start executing what is presumably code that will start up the operating system. That code is called a "boot loader", and it's the first thing we'll need to write, if we're serious about bare metal hacking on this thing.
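To make the validity rule concrete, here is a small sketch in Python (not part of the boot loader itself) that checks a 512-byte block for the 55 aa signature, and pads a blob of boot code into a valid sector. The function names are made up for illustration, and the two-byte test payload eb fe is just an x86 "jump to self" instruction used as a stand-in for real boot code.

```python
SECTOR_SIZE = 512
BOOT_SIGNATURE = bytes([0x55, 0xAA])  # required last two bytes of a boot sector


def is_valid_boot_sector(sector: bytes) -> bool:
    """Return True if this is a 512-byte block ending in the 55 aa signature."""
    return len(sector) == SECTOR_SIZE and sector[510:512] == BOOT_SIGNATURE


def make_boot_sector(code: bytes) -> bytes:
    """Pad the boot code with zeros up to 510 bytes and append the signature."""
    if len(code) > SECTOR_SIZE - len(BOOT_SIGNATURE):
        raise ValueError("boot code must fit in 510 bytes")
    padded = code.ljust(SECTOR_SIZE - len(BOOT_SIGNATURE), b"\x00")
    return padded + BOOT_SIGNATURE


if __name__ == "__main__":
    sector = make_boot_sector(b"\xeb\xfe")  # placeholder code: jump to self
    print(is_valid_boot_sector(sector))
    print(is_valid_boot_sector(bytes(SECTOR_SIZE)))  # all zeros, no signature
```

The same check can be pointed at the first 512 bytes of a disk image to see whether a BIOS would consider it bootable.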
If it's not immediately apparent, let me point out that 510 bytes is way too small to fit any particularly useful operating system, or in this case: bootable demo or game. So the piece of code I'll put into the boot sector must load some more sectors off the original boot storage device (let's call it disk from now on for brevity), with the rest of my code and jump to it; hence the name: boot loader.
Interestingly, it's way more complicated than that... There are a number of obstacles that make it nigh impossible to fit even a reasonable boot loader in the boot sector.
The processor starts up in a mode emulating its aforementioned 16-bit forerunners, called "real mode", and can access a mere 1MB of memory, in a horrific segmented memory model. In this Lovecraftian fever dream, all memory accesses are done by combining the value of a 16-bit segment register with a 16-bit offset, to produce the actual 20-bit address that will go out to the bus. Specifically, the 16 bits of the segment register are shifted 4 bits to the left, and added to the 16-bit offset.
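The address calculation is easier to see with numbers. The following Python sketch models it; the function name is mine, and the final mask to 20 bits is an assumption that models the original 8086/8088, where only 20 address lines exist and anything above 1MB wraps around to zero.

```python
def linear_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment and a 16-bit offset the way real mode does."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must both fit in 16 bits")
    # Shift the segment 4 bits left, add the offset, and keep 20 bits
    # (assumption: 8086-style wraparound at the 1MB boundary).
    return ((segment << 4) + offset) & 0xFFFFF


if __name__ == "__main__":
    # Two different segment:offset pairs naming the boot sector load address.
    print(hex(linear_address(0x0000, 0x7C00)))
    print(hex(linear_address(0x07C0, 0x0000)))
    # The reset vector, expressed as ffff:0000.
    print(hex(linear_address(0xFFFF, 0x0000)))
```

Note that many different segment:offset pairs map to the same linear address, which is a large part of what makes this memory model so unpleasant to work with.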
As I don't want to constrain my programs to fit in 1MB, and certainly don't intend to write my programs in this grotesque memory model, the boot loader will need to switch the processor to 32bit protected mode, before it loads the rest of my program into memory. In fact I intend to load my whole main program starting at the 1MB mark, and it's impossible to do that from real mode.
So I'll make what is known as a two-stage boot loader. The first stage, which is loaded by the BIOS and must fit in 510 bytes, will be a very simple real-mode program which loads the rest of the boot loader (which can, and will, be much larger) and jumps to it. The second stage will then switch to 32bit protected mode, load the whole main program without any size limitations, and start it already in a sane 32bit execution environment with a linear memory model.
The next question that needs to be answered is how to load the second stage from disk. This one turns out to be simple enough, because the BIOS conveniently provides a number of services, one of which is reading a bunch of sectors from a disk into a memory buffer. Before it gives us control, the BIOS has hooked the interrupt vector 13h, which we can call by issuing the int instruction (software interrupt). When we do, the interrupt handler in the BIOS takes control, checks the value of the ah register where it expects to find the number of the operation we want, and performs that operation.
The BIOS call number for reading sectors off a disk is 2. That call expects a number of arguments in other registers. Specifically, it expects the number of sectors to read in al, the cylinder number in the upper 10 bits of cx (its low 8 bits in ch, and its top 2 bits in the two highest bits of cl), the head in dh, the starting sector within the track defined by cylinder and head in the lowest 6 bits of cl (sectors are numbered starting from 1), the device number in dl, and finally the destination pointer in es:bx (segment:offset). The sectors loaded by a single call must all be within a single track on the disk, and within the same 64k segment in the destination.
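The register layout for this call is fiddly, mostly because the 10-bit cylinder number is split: its low 8 bits go in ch, and its top 2 bits go in bits 6 and 7 of cl, above the 6-bit sector number. Here is a Python sketch of the packing, plus a helper that works out how many sectors a single call can read without crossing a track or running past the end of the 64k destination segment. Both function names are hypothetical, and the real thing would of course be a handful of shifts and ors in assembly.

```python
SECTOR_SIZE = 512


def pack_read_regs(cylinder, head, sector, count, drive):
    """Return the ax, cx and dx values for int 13h, function 2 (read sectors)."""
    if not (0 <= cylinder <= 1023 and 0 <= head <= 255 and 1 <= sector <= 63):
        raise ValueError("CHS address out of range")
    if not (1 <= count <= 255 and 0 <= drive <= 255):
        raise ValueError("sector count or drive number out of range")
    ah, al = 0x02, count                         # function number, sector count
    ch = cylinder & 0xFF                         # low 8 bits of the cylinder
    cl = ((cylinder >> 8) & 0x03) << 6 | sector  # top 2 cylinder bits + sector
    dh, dl = head, drive
    return {"ax": ah << 8 | al, "cx": ch << 8 | cl, "dx": dh << 8 | dl}


def max_sectors_per_call(sector, sectors_per_track, dest_offset):
    """Sectors readable in one call, starting at `sector` (1-based) into es:dest_offset."""
    left_in_track = sectors_per_track - sector + 1
    left_in_segment = (0x10000 - dest_offset) // SECTOR_SIZE
    return min(left_in_track, left_in_segment)


if __name__ == "__main__":
    # Read 1 sector from cylinder 0, head 0, sector 2 of the first floppy.
    regs = pack_read_regs(cylinder=0, head=0, sector=2, count=1, drive=0)
    print({name: hex(value) for name, value in regs.items()})
```

A loader that needs more sectors than max_sectors_per_call allows simply issues several calls, advancing the CHS address and the destination pointer between them.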
BIOS calls are only callable from real mode. After I switch to protected mode in the second stage boot loader, I'll have to jump through a few more hoops (virtual 8086 mode) to keep using them to load the main program. The reason I want to use the BIOS for reading even after the switch to protected mode, instead of writing a fully 32bit driver, is that I want to allow booting from any device supported by the BIOS (floppy, USB stick, CDROM, etc), and I'd rather not write drivers for USB in particular, especially not in the boot loader which I intend to keep as simple as possible.
Here's a video of the first test. It loads a dummy second stage which just draws something on screen to help me completely debug the first stage loader, before moving on to write the actual second stage loader.
I also managed to fit some code to print text and numbers, both to the screen, and the serial port, which helped a lot in debugging.
One bug I had initially was a failure to load correctly when booting from a USB stick instead of a floppy. The reason, predictably, was that I used hardcoded floppy parameters for the translation from linear sector numbers to CHS (cylinder/head/sector) addresses, while the BIOS emulates a large disk with arbitrary CHS geometry when booting from a USB stick.
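To illustrate why hardcoded geometry breaks, here is the usual linear-to-CHS translation as a Python sketch, parameterized by the number of heads and sectors per track. The floppy values (2 heads, 18 sectors per track) are those of a standard 1.44MB disk; the 255/63 geometry is only an example of what a BIOS might report for an emulated hard disk, not something taken from my test machine. One common way to get the real values is BIOS int 13h function 8 (get drive parameters).

```python
def lba_to_chs(lba, heads, sectors_per_track):
    """Translate a 0-based linear sector number to (cylinder, head, sector)."""
    if lba < 0 or heads < 1 or sectors_per_track < 1:
        raise ValueError("invalid sector number or geometry")
    cylinder, remainder = divmod(lba, heads * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1  # sectors are numbered from 1


if __name__ == "__main__":
    # The same linear sector lands in different places on different geometries.
    print(lba_to_chs(18, heads=2, sectors_per_track=18))    # 1.44MB floppy
    print(lba_to_chs(18, heads=255, sectors_per_track=63))  # example emulated disk
```

With floppy geometry, linear sector 18 is the first sector on the second head; with the larger geometry it is still on the first track. Asking for the floppy address on the emulated disk reads the wrong sector entirely.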
Another bug was that I was uploading only half of the palette to the VGA DAC in the test program, because I used a jno (jump if not overflow) instruction instead of a jnc (jump if not carry) to control the loop.
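The "half" is not a coincidence. The sketch below models the flags of an 8-bit add in Python, and counts how many times a loop runs before each flag is set. It assumes an 8-bit index starting at zero and advanced by adding 1 each iteration, which is a simplification for illustration and not a transcription of the actual test program. (The add matters: the x86 inc instruction leaves the carry flag untouched.)

```python
def add8(a, b):
    """8-bit add returning (result, carry_flag, overflow_flag)."""
    total = a + b
    result = total & 0xFF
    carry = total > 0xFF  # unsigned wraparound past 255
    # Signed overflow: both operands differ in sign from the result.
    overflow = ((a ^ result) & (b ^ result) & 0x80) != 0
    return result, carry, overflow


def count_iterations(stop_flag):
    """Count loop passes until the chosen flag ('carry' or 'overflow') is set."""
    index, iterations = 0, 0
    while True:
        iterations += 1  # the loop body, e.g. sending one palette entry
        index, carry, overflow = add8(index, 1)
        if (carry if stop_flag == "carry" else overflow):
            return iterations


if __name__ == "__main__":
    print(count_iterations("carry"))     # loop controlled by jnc
    print(count_iterations("overflow"))  # loop controlled by jno
```

The carry flag is set when the index wraps from 255 to 0, after all 256 entries, while the overflow flag is set as soon as the index crosses from 127 to 128, which in signed terms is a jump from +127 to -128.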
Other than these minor setbacks, the first stage loader worked without much fuss. Here is the code of the first stage loader: boot.s.
The full source code can be found at the pcboot github page. The code corresponding to this stage of the project as described in this article is tagged as test1_rm.