Recently I came across an old recovery disk for my parents’ old 486 from 1993 (the first machine I really seriously programmed — the Tandy before it got some QBasic time, but nothing as crazy as a C compiler. Some real mode assembly though once or twice). I managed to cobble things together to get it booting in VirtualBox (you see, in those days booting from CD didn’t exist, so the CD itself isn’t bootable, it just has a bunch of zips that were written to floppies by an operational system — getting that up and running is a post for another day, but it required hexediting the virtual harddrive to lay down IO.SYS and MSDOS.SYS properly in FAT16, whee).
Anyway, after it was up and running, I figured it might be fun to start tearing it apart. Since the entire machine state fits in RAM (4MB RAM, 512MB hard drive, 527MB restore CD, a couple of 1.4MB boot floppies), it’s easy to blow it away and restore it very rapidly.
So, let’s start from the beginning: the Master Boot Record. This is the MBR laid down by DOS 6. Before this there’s the BIOS, but unfortunately I don’t have access to that (otherwise, it’d be a fun but much more complex piece of software to explore), and the MBR’s really tiny (just 512 bytes, of which at most 446 are executable code).
Here’s the Hex for a DOS 6 MBR on a ~512MB harddrive with just 1 “primary DOS partition”.
FA33C08ED0BC007C8BF45007501FFBFCBF0006B90001F2A5EA1D060000BEBE07
B304803C80740E803C00751C83C610FECB75EFCD188B148B4C028BEE83C610FE
CB741A803C0074F4BE8B06AC3C00740B56BB0700B40ECD105EEBF0EBFEBF0500
BB007CB8010257CD135F730C33C0CD134F75EDBEA306EBD3BEC206BFFE7D813D
55AA75C78BF5EA007C0000496E76616C696420706172746974696F6E20746162
6C65004572726F72206C6F6164696E67206F7065726174696E67207379737465
6D004D697373696E67206F7065726174696E672073797374656D000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000008001
0100061FBF073F000000C1FE0F00000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000055AA
55AA at the end indicates a “valid” boot sector (sort of). The long stream of zeros mid-way through is the unused part of the code area, and the stuff just before the end is the Partition Table (we’ll ignore that for now; look it up on Wikipedia if you’re curious).
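For the curious, here is a minimal Python sketch of what those last bytes hold. It assumes the classic 16-byte MBR partition entry layout (status, start CHS, type, end CHS, start LBA, sector count). The names `ENTRY0`, `build_sector`, `has_boot_signature`, `decode_chs` and `parse_entry` are made up for illustration, and the sector is rebuilt with only the first partition entry and the signature filled in.

```python
import struct

# First (and only non-zero) partition entry, copied from offsets
# 0x1BE-0x1CD of the hex dump above.
ENTRY0 = bytes.fromhex("80010100061fbf073f000000c1fe0f00")


def build_sector() -> bytes:
    """Rebuild just the tail of the sector; the boot code is left as zeros."""
    sector = bytearray(512)
    sector[0x1BE:0x1CE] = ENTRY0
    sector[0x1FE:0x200] = b"\x55\xaa"
    return bytes(sector)


def has_boot_signature(sector: bytes) -> bool:
    """True if the sector is 512 bytes long and ends in 55 AA."""
    return len(sector) == 512 and sector[0x1FE:0x200] == b"\x55\xaa"


def decode_chs(head: int, sec_cyl: int, cyl_low: int) -> tuple:
    """Unpack a 3-byte CHS field into (cylinder, head, sector).

    The top two bits of the sector byte are bits 8-9 of the cylinder.
    """
    return ((sec_cyl & 0xC0) << 2) | cyl_low, head, sec_cyl & 0x3F


def parse_entry(raw: bytes) -> dict:
    """Decode one 16-byte partition entry (assumed classic MBR layout)."""
    status, h0, s0, c0, ptype, h1, s1, c1, lba, count = struct.unpack("<8B2I", raw)
    return {
        "active": status == 0x80,
        "type": ptype,
        "start_chs": decode_chs(h0, s0, c0),
        "end_chs": decode_chs(h1, s1, c1),
        "start_lba": lba,
        "sectors": count,
        "size_bytes": count * 512,
    }


if __name__ == "__main__":
    print(has_boot_signature(build_sector()))
    print(parse_entry(ENTRY0))
```

Under those assumptions the entry decodes to an active partition of type 0x06 starting at sector 63 and spanning 1,048,257 sectors, which is a little under 512MB and matches the drive size mentioned above.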
Let’s take a look at the non-zero bits at the beginning:
FA33C08ED0BC007C8BF45007501FFBFCBF0006B90001F2A5EA1D060000BEBE07
B304803C80740E803C00751C83C610FECB75EFCD188B148B4C028BEE83C610FE
CB741A803C0074F4BE8B06AC3C00740B56BB0700B40ECD105EEBF0EBFEBF0500
BB007CB8010257CD135F730C33C0CD134F75EDBEA306EBD3BEC206BFFE7D813D
55AA75C78BF5EA007C0000496E76616C696420706172746974696F6E20746162
6C65004572726F72206C6F6164696E67206F7065726174696E67207379737465
6D004D697373696E67206F7065726174696E672073797374656D00
(1 trailing zero byte left since it’s the end of a string which might be null terminated – we’ll see that next).
Let’s start by looking for strings.
$ hexdump -C dos6.mbr
00000000 fa 33 c0 8e d0 bc 00 7c 8b f4 50 07 50 1f fb fc |.3.....|..P.P...|
00000010 bf 00 06 b9 00 01 f2 a5 ea 1d 06 00 00 be be 07 |................|
00000020 b3 04 80 3c 80 74 0e 80 3c 00 75 1c 83 c6 10 fe |...<.t..<.u.....|
00000030 cb 75 ef cd 18 8b 14 8b 4c 02 8b ee 83 c6 10 fe |.u......L.......|
00000040 cb 74 1a 80 3c 00 74 f4 be 8b 06 ac 3c 00 74 0b |.t..<.t.....<.t.|
00000050 56 bb 07 00 b4 0e cd 10 5e eb f0 eb fe bf 05 00 |V.......^.......|
00000060 bb 00 7c b8 01 02 57 cd 13 5f 73 0c 33 c0 cd 13 |..|...W.._s.3...|
00000070 4f 75 ed be a3 06 eb d3 be c2 06 bf fe 7d 81 3d |Ou...........}.=|
00000080 55 aa 75 c7 8b f5 ea 00 7c 00 00 49 6e 76 61 6c |U.u.....|..Inval|
00000090 69 64 20 70 61 72 74 69 74 69 6f 6e 20 74 61 62 |id partition tab|
000000a0 6c 65 00 45 72 72 6f 72 20 6c 6f 61 64 69 6e 67 |le.Error loading|
000000b0 20 6f 70 65 72 61 74 69 6e 67 20 73 79 73 74 65 | operating syste|
000000c0 6d 00 4d 69 73 73 69 6e 67 20 6f 70 65 72 61 74 |m.Missing operat|
000000d0 69 6e 67 20 73 79 73 74 65 6d 00 00 00 00 00 00 |ing system......|
000000e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000001b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80 01 |................|
000001c0 01 00 06 1f bf 07 3f 00 00 00 c1 fe 0f 00 00 00 |......?.........|
000001d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000001f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 aa |..............U.|
So we see three strings in there: “Invalid partition table”, “Error loading operating system”, and “Missing operating system”. Those three cover about a third of the bytes before the zeros start. Now let’s look at the disassembly.
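As an aside, the string offsets can also be pulled out programmatically rather than by eyeballing the dump. The following is a rough Python sketch, not anything the original tooling does: `MBR_HEAD` is the non-zero prefix of the sector copied from the hex above, and `find_c_strings` is a hypothetical helper that looks for runs of printable ASCII followed by a zero byte.

```python
import re

# The non-zero start of the MBR (0xDB bytes), copied from the hex above.
MBR_HEAD = bytes.fromhex(
    "FA33C08ED0BC007C8BF45007501FFBFCBF0006B90001F2A5EA1D060000BEBE07"
    "B304803C80740E803C00751C83C610FECB75EFCD188B148B4C028BEE83C610FE"
    "CB741A803C0074F4BE8B06AC3C00740B56BB0700B40ECD105EEBF0EBFEBF0500"
    "BB007CB8010257CD135F730C33C0CD134F75EDBEA306EBD3BEC206BFFE7D813D"
    "55AA75C78BF5EA007C0000496E76616C696420706172746974696F6E20746162"
    "6C65004572726F72206C6F6164696E67206F7065726174696E67207379737465"
    "6D004D697373696E67206F7065726174696E672073797374656D00"
)


def find_c_strings(data: bytes, min_len: int = 4) -> list:
    """Return (offset, text) for each NUL-terminated run of printable ASCII.

    min_len is an arbitrary cutoff to skip short runs of opcode bytes
    that merely happen to be printable.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}\x00" % min_len)
    return [
        (m.start(), m.group()[:-1].decode("ascii"))
        for m in pattern.finditer(data)
    ]


if __name__ == "__main__":
    for offset, text in find_c_strings(MBR_HEAD):
        print(f"0x{offset:02X}: {text}")
```

The offsets it reports are relative to the start of the sector, so they need the load address added before they can be compared with addresses in a disassembly.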
For reference, the first string starts at offset 0x8B (139), and after the strings it looks like it’s just zeros until the partition table, so we’ll focus our disassembly efforts on the first 0x8B (139) bytes. Also, the BIOS loads the MBR at 0x7C00, so we’ll tell the disassembler to pretend like the code is at that location in memory. It also starts in Real Mode (1MB addressable, segmented, no virtual memory, dark ages stuff). Here we go.
$ ndisasm -a -o 0x7c00 -k 0x7c8B,0x200 dos6.mbr
00007C00  FA          cli                  ; clear interrupt flag (disable interrupts)
00007C01  33C0        xor ax,ax            ; ax = 0
00007C03  8ED0        mov ss,ax            ; ss = ax (0) (this is our stack "segment")
00007C05  BC007C      mov sp,0x7c00        ; sp = 0x7c00 (this is our stack - remember, it grows down)
00007C08  8BF4        mov si,sp            ; si = sp (0x7c00)
00007C0A  50          push ax              ; push ax (0) on the stack
00007C0B  07          pop es               ; pop es (this sets it to 0 because that's the last thing we pushed)
00007C0C  50          push ax              ; push ax (0) (again)
00007C0D  1F          pop ds               ; pop ds (same as es above)
00007C0E  FB          sti                  ; set interrupt flag (enable interrupts)
00007C0F  FC          cld                  ; clear direction flag (string operations work forward)
00007C10  BF0006      mov di,0x600         ; di = 0x600 (1536, or 1.5k)
00007C13  B90001      mov cx,0x100         ; cx = 0x100 (256)
00007C16  F2A5        repne movsw          ; memcpy si (0x7c00 above) to di, cx "words" (16 bit values).
                                           ; (F2 is the repne prefix; movsw doesn't test flags, so in practice it acts like a plain rep)
00007C18  EA1D060000  jmp 0x0:0x61d        ; jump to 0x61d (cs = 0)
00007C1D  BEBE07      mov si,0x7be         ; record scratching ...
OK, so just a few instructions in, we’ve got some basic but useful stuff going on. First it zeros out the ss, es, and ds segment registers because the BIOS could leave them weird (cs gets zeroed a moment later by the far jump), and it sets up the stack at 0x7c00 (just under the MBR). Then it copies 256 words (512 bytes) from 0x7c00 to 0x600, which is the whole MBR. Then we jump to 0x61d. Notice that the instruction after the jump sits at address 0x7C1D, which is 0x1D bytes after the beginning of the MBR, and 0x61d is 0x1d bytes beyond the start of the copied MBR. So basically that jump goes to the next instruction, but in the relocated copy. It’s common for MBRs to copy themselves elsewhere and then load stuff at 0x7c00 (this is why they move themselves out of the way). So now let’s disassemble it again, but with the disassembler thinking the code is at 0x600 (since that’s where the remainder of the code will be executing).
This time there’s a lot of flow control, so I’ve interjected a lot of commentary in-line:
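The address arithmetic behind that jump is easy to sanity-check. Here is a small Python sketch of it; `linear` and `relocated` are illustrative helper names, and `linear` uses the usual real mode rule that a segment:offset pair maps to segment * 16 + offset.

```python
LOAD_ADDR = 0x7C00   # where the BIOS puts the MBR
RELOC_ADDR = 0x0600  # where the MBR copies itself


def linear(segment: int, offset: int) -> int:
    """Real mode address translation: segment * 16 + offset."""
    return (segment << 4) + offset


def relocated(addr: int) -> int:
    """Map an address inside the original copy to the relocated copy."""
    return addr - LOAD_ADDR + RELOC_ADDR


if __name__ == "__main__":
    # The instruction after the far jump lives at 0x7C1D in the original copy.
    print(hex(relocated(0x7C1D)))   # target of jmp 0x0:0x61d
    print(hex(linear(0x0000, 0x061D)))
    print(0x100 * 2)                # cx words copied by movsw, in bytes
```

Because the segment registers are all zero here, offsets and linear addresses are the same number, which is why the listing can treat 0x61d as a plain address.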
$ ndisasm -a -o 0x600 -k 0x68B,0x200 dos6.mbr
[0x7c00 relocation code omitted for brevity]
00000618  EA1D060000  jmp 0x0:0x61d        ; (the jump to 0x61d from before)
0000061D  BEBE07      mov si,0x7be         ; (this is where we'd continue operating) si = 0x7be
00000620  B304        mov bl,0x4           ; bl = 4 (maximum number of partitions)
; check partition
00000622  803C80      cmp byte [si],0x80   ; compare the byte at offset si with 0x80
; 0x7BE is 0x1BE bytes from the beginning of our relocated MBR -- our MBR is 0x200 bytes, so this is
; inside the MBR, as opposed to some random memory location. 0x1BE is the first byte of the first
; partition table entry. The first byte of each entry is the status byte -- 0x00 means
; inactive, 0x80 means active.
00000625  740E        jz 0x635             ; if equal (this partition is active), jump to 0x635
; this partition wasn't active
00000627  803C00      cmp byte [si],0x0    ; is the partition status inactive?
0000062A  751C        jnz 0x648            ; it's not - it's an invalid value. jump to 0x648
0000062C  83C610      add si,byte +0x10    ; advance to next partition structure (16 bytes)
0000062F  FECB        dec bl               ; decrement bl (partition count)
00000631  75EF        jnz 0x622            ; as long as bl isn't 0 jump to check partition above
00000633  CD18        int 0x18             ; start BASIC interpreter, or reboot, or whatever
; found an active partition
00000635  8B14        mov dx,[si]          ; dl = status byte (0x80, which doubles as the BIOS drive number), dh = starting head
00000637  8B4C02      mov cx,[si+0x2]      ; cx = partition sector and cylinder start
0000063A  8BEE        mov bp,si            ; bp = start of partition structure
; check remaining partitions
0000063C  83C610      add si,byte +0x10    ; advance si by 16 (point to next partition structure)
0000063F  FECB        dec bl               ; decrement bl (partition count)
00000641  741A        jz 0x65d             ; jump to 0x65d once we've exhausted all partitions
00000643  803C00      cmp byte [si],0x0    ; is this partition marked inactive?
00000646  74F4        jz 0x63c             ; if it is, jump above to check next partition
; if one of the remaining partitions is marked active or otherwise not set to 0 (inactive)
; we fall through to print an error message
; invalid partition status byte
00000648  BE8B06      mov si,0x68b         ; si = offset of "Invalid partition table"
; top of string printing loop
0000064B  AC          lodsb                ; load a string byte to al
0000064C  3C00        cmp al,0x0           ; end of string?
0000064E  740B        jz 0x65b             ; if so, jump to infinite loop
00000650  56          push si              ; save si
00000651  BB0700      mov bx,0x7           ; bh = page 0, bl = color 0x7 (grey text, black background)
00000654  B40E        mov ah,0xe           ; print character function (0xE)
00000656  CD10        int 0x10             ; INT 0x10, Function 0xE, color 0x7
00000658  5E          pop si               ; restore si
00000659  EBF0        jmp short 0x64b      ; jump to top of string printing loop
; infinite loop
0000065B  EBFE        jmp short 0x65b
; partition table is well-formed (exactly 1 active partition)
0000065D  BF0500      mov di,0x5           ; di = 5 (retries)
; read start of partition
00000660  BB007C      mov bx,0x7c00        ; destination = 0x7c00
00000663  B80102      mov ax,0x201         ; ah = 0x02 (read sectors), al = 0x01 (1 sector)
00000666  57          push di              ; save di
00000667  CD13        int 0x13             ; read sectors (AH = 0x2)
00000669  5F          pop di               ; restore di
0000066A  730C        jnc 0x678            ; on no error jump to final checks
0000066C  33C0        xor ax,ax            ; ax = 0
0000066E  CD13        int 0x13             ; reset disks (AH = 0x0)
00000670  4F          dec di               ; decrement tries
00000671  75ED        jnz 0x660            ; try again if there are tries left
; exhausted all retries
00000673  BEA306      mov si,0x6a3         ; "Error loading operating system" string
00000676  EBD3        jmp short 0x64b      ; jump to string printing loop
; final checks
00000678  BEC206      mov si,0x6c2         ; "Missing operating system" string
0000067B  BFFE7D      mov di,0x7dfe        ; address of the last word of the sector we just loaded
0000067E  813D55AA    cmp word [di],0xaa55 ; is the last word the boot signature?
00000682  75C7        jnz 0x64b            ; if not, jump to string printing loop
00000684  8BF5        mov si,bp            ; si = start of the active partition's structure (saved in bp above)
00000686  EA007C0000  jmp 0x0:0x7c00       ; jump to loaded bytes!
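The partition table scan is the trickiest bit of control flow in there, so here is a rough Python model of it. This is just a sketch of my reading of the listing, not a translation of the whole MBR: `find_active_entry`, `make_table` and the two exception classes are invented names, and the exceptions stand in for the "Invalid partition table" message and the int 0x18 path.

```python
class InvalidPartitionTable(Exception):
    """Stands in for the 'Invalid partition table' error path (0x648)."""


class NoActivePartition(Exception):
    """Stands in for the int 0x18 path (0x633)."""


def make_table(statuses) -> bytes:
    """Hypothetical helper: build a 64-byte table with the given status bytes."""
    table = bytearray(64)
    for index, status in enumerate(statuses):
        table[index * 16] = status
    return bytes(table)


def find_active_entry(table: bytes) -> int:
    """Return the index (0-3) of the single active entry.

    Mirrors the two loops in the listing: scan for the first 0x80 status,
    rejecting anything that is neither 0x80 nor 0x00, then require every
    remaining entry to be 0x00.
    """
    statuses = [table[index * 16] for index in range(4)]
    for index, status in enumerate(statuses):
        if status == 0x80:
            # Second loop (0x63c): the rest must all be marked inactive.
            if any(other != 0x00 for other in statuses[index + 1:]):
                raise InvalidPartitionTable()
            return index
        if status != 0x00:
            raise InvalidPartitionTable()
    raise NoActivePartition()


if __name__ == "__main__":
    print(find_active_entry(make_table([0x80, 0x00, 0x00, 0x00])))
```

One consequence that falls out of this model: a second entry marked active is treated the same as a garbage status byte, so the code insists on exactly one active partition.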
So there we have it: 139 bytes of code. It relocates itself to 0x600 and checks the partition table. If the table is good, it loads the first sector of the active partition to 0x7c00 and jumps to it; otherwise it prints an error message. It has some retries and some sanity checks, but otherwise it’s pretty straightforward. Next time we’ll check out the sector it loads and see what that does.