SEGA megadrive toolchain setup

John Tsiombikas nuclear@mutantstargoat.com

5 January 2024

This is more of a "notes for future reference" post than anything else. If you get to the end of it and ask yourself "well, what's your point?", that's why: I don't necessarily have a point to make...

I recently decided to revisit megadrive hacking. As a Debian user, and most importantly a lazy person, my immediate automatic response to the need for a toolchain for it was: apt-get install gcc-m68k-linux-gnu. And that works perfectly fine for the most part (and is delightfully lazy, so that's a plus). The fact that it's meant to build 68k GNU/Linux programs, and not flat binaries for the megadrive ROM cartridge is no problem; just make the appropriate linker script and pass the correct arguments to gcc and ld, and it works... until it doesn't.

For the record here's my linker script for megadrive projects. I intend to write an article at some point, explaining how linker scripts work and how awesome they are.

OUTPUT_ARCH(m68k)

MEMORY
{
    rom : ORIGIN = 0x00000000, LENGTH = 0x00a00000
    ram : ORIGIN = 0x00ff0000, LENGTH = 0x00010000
}

PROVIDE (_stacktop = 0x01000000);

SECTIONS {
    /* ---- start of ROM ---- */
    /* .vect section is used to place the m68k exception vectors at the
     * beginning of the address space
     */
    .vect : { * (.vect); } >rom
    /* .romhdr section is used to place the SEGA ROM header at 0x100 */
    . = 0x100;
    .romhdr : { * (.romhdr); } >rom
    .text : { * (.text); } >rom
    .rodata : { * (.rodata); } >rom

    /* place the load address of the .data section after .rodata */
    . = ALIGN(4);
    _data_lma = .;
    _rom_end = _data_lma + _data_size;

    /* ---- start of RAM ---- */
    . = 0xff0000;
    /* place the .data section at the start of RAM */
    .data ALIGN(4): AT (_data_lma) {
        _data_start = .;
        * (.data);
        . = ALIGN(4);
        _data_end = .;
    } >ram
    _data_size = SIZEOF(.data);

    /* place the .bss section at the end */
    .bss ALIGN(4): {
        _bss_start = .;
        * (.bss);
        * (COMMON);
        . = ALIGN(4);
        _bss_end = .;
    } >ram
    _bss_size = SIZEOF(.bss);
}
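Since .data is loaded in ROM but runs from RAM, something has to copy it across at startup, and .bss has to be cleared. Here's a minimal sketch of what that could look like in C. The function names copy_data and zero_bss are made up for illustration (my actual startup code is not shown in this post); only the underscore-prefixed symbols in the trailing comment come from the linker script above.

```c
#include <stdint.h>

/* Copy the initialized-data image from its load address (ROM) to its
 * run address (RAM), one 32-bit word at a time. All pointers are assumed
 * to be 4-byte aligned, which the ALIGN(4) directives in the linker
 * script take care of. */
void copy_data(uint32_t *dst, const uint32_t *dst_end, const uint32_t *src)
{
    while (dst < dst_end) {
        *dst++ = *src++;
    }
}

/* Clear the zero-initialized region [start, end). */
void zero_bss(uint32_t *start, const uint32_t *end)
{
    while (start < end) {
        *start++ = 0;
    }
}

/* On the target, startup code could call these with the linker symbols:
 *
 *   extern uint32_t _data_lma[], _data_start[], _data_end[];
 *   extern uint32_t _bss_start[], _bss_end[];
 *
 *   copy_data(_data_start, _data_end, _data_lma);
 *   zero_bss(_bss_start, _bss_end);
 */
```

Passing the addresses in as arguments, instead of referencing the linker symbols directly, is just so the sketch can also be compiled and tried out on a PC.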

Divide and conquer crash

Everything was going smoothly, until I added my printf implementation to the source code, for debug output, and used it to print a number. My printf had the audacity to divide by 10 (and also modulo 10), which immediately led to a hang. If I printed in hex all was well, but the first appearance of a "%d" just stopped the megadrive in its tracks. You must understand that dividing on the 68000 is to be avoided whenever possible, and there were no other divisions in any of my megadrive hacks, so it took some time to figure out what was going on.
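To make it concrete, this is roughly the kind of code that was involved. It's a sketch and not my actual printf; the name utoa10 is made up. The 68000's divide instructions only take a 16-bit divisor and produce a 16-bit quotient, so for full 32-bit operands like these the compiler may emit calls to libgcc helpers (such as __udivsi3 and __umodsi3, or __divsi3 and __modsi3 for signed values) instead of inline code.

```c
#include <stdint.h>

/* Write the decimal representation of val into buf, which must hold at
 * least 11 bytes (10 digits for a 32-bit value plus the terminator).
 * Returns buf. */
char *utoa10(uint32_t val, char *buf)
{
    char tmp[10];
    int n = 0;
    int i = 0;

    /* Produce the digits least-significant first. */
    do {
        tmp[n++] = (char)('0' + val % 10);  /* 32-bit modulo */
        val /= 10;                          /* 32-bit divide */
    } while (val != 0);

    /* Reverse them into the output buffer. */
    while (n > 0) {
        buf[i++] = tmp[--n];
    }
    buf[i] = '\0';
    return buf;
}
```

Printing in hex needs only shifts and masks, which is consistent with "%x" working while "%d" did not.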

The hang was obviously due to an exception, since all my unused exception vectors point to a simple stop #0x2700 (disable interrupts and stop), but which exception and why? It can't be a division by 0, since that code divides by a hardcoded 10. Changing the handler to print some information about the exception before dying is most illuminating:

[screenshot: panic screen printed by the exception handler]

Clearly 227d is not a reasonable address to appear in the program counter of a processor which executes 16-bit aligned instructions. Let's take a look at the disassembly to see how we end up there...

00002270 <__modsi3>:
    ...
    227c:    61ff          bsrs 227d <__modsi3+0xd>
    227e:    ffff          .short 0xffff
    2280:    ff92          .short 0xff92
    2282:    508f          addql #8,%sp
    ...

__modsi3 is a function in libgcc, used to compute the modulo. It makes sense for the issue to be there, since the exception started appearing when I first started dividing and computing modulo by 10.

But 61ff at address 227c is the first part of a branch to subroutine instruction with a 32bit relative offset, not a short bsr by -1 ending up at 227d! Unfortunately, the 68000 doesn't have a bsr with a 32bit relative offset; only 8 or 16bit. So it takes the ff as an 8bit displacement of -1, branches to the odd address 227d, and dies. The instruction we see here is available on the 68020 or later processors.
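The two readings of that instruction can be worked out with a few lines of C. This is only an illustrative sketch: bsr_target and its has_long_disp flag are names I'm making up here, and the function assumes it is handed a bsr opcode word without checking.

```c
#include <stdint.h>

/* Compute the target of a BSR instruction located at addr.
 * opword is the first 16-bit instruction word; ext points to the
 * big-endian bytes that follow it. If has_long_disp is nonzero, a 0xff
 * displacement byte selects a 32-bit displacement (68020 and later);
 * otherwise it is taken as an 8-bit displacement of -1 (68000). */
uint32_t bsr_target(uint32_t addr, uint16_t opword, const uint8_t *ext,
                    int has_long_disp)
{
    uint32_t base = addr + 2;   /* displacements are relative to addr + 2 */
    uint8_t d8 = (uint8_t)(opword & 0xff);

    if (d8 == 0x00) {
        /* 16-bit displacement in the following word */
        int16_t d16 = (int16_t)(uint16_t)((ext[0] << 8) | ext[1]);
        return base + (uint32_t)(int32_t)d16;
    }
    if (d8 == 0xff && has_long_disp) {
        /* 32-bit displacement in the following two words */
        uint32_t d32 = ((uint32_t)ext[0] << 24) | ((uint32_t)ext[1] << 16) |
                       ((uint32_t)ext[2] << 8) | (uint32_t)ext[3];
        return base + d32;      /* unsigned wrap-around handles negative offsets */
    }
    /* 8-bit displacement stored in the opcode word itself */
    return base + (uint32_t)(int32_t)(int8_t)d8;
}
```

Feeding it the bytes from the disassembly (address 0x227c, opcode 0x61ff, followed by ff ff ff 92) gives 0x227d for the 68000 reading and 0x2210 for the 68020 reading.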

In retrospect it makes sense that the m68k-linux-gnu toolchain in Debian would be compiled for later Motorola processors, since Linux needs an MMU.

There are two ways to overcome this problem. The laziest, which obviously I went for first, is to just drop the source code of these support functions from libgcc into my project, and include them in my build, instead of linking libgcc; that works perfectly fine. Later, however, I went looking for a 68k-compatible gdb, which unfortunately Debian does not provide, and I thought if I'm going to have to build gdb myself for the 68000, I might as well take the opportunity and build the whole toolchain correctly and be done with it.
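For reference, a support function like that doesn't have to be complicated. The following is not the libgcc code I dropped in, just a plain shift-and-subtract sketch of unsigned 32-bit division under a made-up name (udivmod32), to show what the helpers have to do on a CPU without a full 32-bit divide instruction. The divisor is assumed to be nonzero.

```c
#include <stddef.h>
#include <stdint.h>

/* Unsigned 32-bit division by binary long division.
 * Returns num / den and, if rem is not NULL, stores num % den in *rem.
 * den must not be zero. Uses only shifts, compares and subtractions,
 * so it does not itself depend on any division helper. */
uint32_t udivmod32(uint32_t num, uint32_t den, uint32_t *rem)
{
    uint32_t quot = 0;
    uint32_t r = 0;
    int i;

    for (i = 31; i >= 0; i--) {
        /* Bring down the next bit of the dividend. r never exceeds
         * num >> (i + 1) before the shift, so this cannot overflow. */
        r = (r << 1) | ((num >> i) & 1);
        if (r >= den) {
            r -= den;
            quot |= (uint32_t)1 << i;
        }
    }
    if (rem != NULL) {
        *rem = r;
    }
    return quot;
}
```

For example, udivmod32(1234, 10, &r) returns 123 and leaves 4 in r. To actually replace the libgcc routines, the functions would have to carry the names the compiler emits calls to (like __udivsi3 and __umodsi3), with signed variants built on top.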

Building a 68000 toolchain

To obtain a complete 68000 toolchain, we need to build binutils, and gcc for the target architecture. And let's not forget gdb as well, which is often useful.

Build binutils

Let's start by downloading the latest release of binutils, which in my case was 2.41:

wget https://ftp.gnu.org/gnu/binutils/binutils-2.41.tar.gz
tar xzvf binutils-2.41.tar.gz

The best way to build these packages is out-of-source, so we can use the same source tree to build for other targets as well if the need arises. We'll pass --target=m68k-elf to configure to specify the target architecture, along with --program-prefix=m68k-elf- so that the assembler, for instance, will be installed as m68k-elf-as, and --disable-nls because we don't care for internationalization-induced bloat.

mkdir build-binutils && cd build-binutils
../binutils-2.41/configure --program-prefix=m68k-elf- --target=m68k-elf --disable-nls
make -j4
sudo make install

Build GCC

Similarly with GCC we'll download and extract the latest release:

wget https://ftp.gnu.org/gnu/gcc/gcc-13.2.0/gcc-13.2.0.tar.gz
tar xzvf gcc-13.2.0.tar.gz

And during configure this time I'll add --enable-languages=c because I don't want any other front-ends (if you want a C++ compiler you should use c,c++ here), and --disable-libssp, which is the runtime library for GCC's stack smashing protection; we don't need it, we don't want it, and it would also need further dependencies we can't have on the megadrive. Also we'll have to pass CFLAGS_FOR_TARGET=-m68000 to the build, to make sure anything compiled for the target machine (like libgcc) will only use 68000 instructions.

mkdir build-gcc && cd build-gcc
../gcc-13.2.0/configure --target=m68k-elf --program-prefix=m68k-elf- --enable-languages=c \
        --disable-libssp --disable-nls
make -j4 CFLAGS_FOR_TARGET=-m68000
sudo make install

And now if we build again and take a look at the disassembly of our program, the 32bit relative branch to subroutine has been replaced by an absolute jump to subroutine, which is perfectly legal on a 68000:

2288:   4eb9 0000 21f0  jsr 21f0 <__udivsi3>

Build the debugger

Same routine:

wget https://ftp.gnu.org/gnu/gdb/gdb-14.1.tar.gz
tar xzvf gdb-14.1.tar.gz

And then:

mkdir build-gdb && cd build-gdb
../gdb-14.1/configure --program-prefix=m68k-elf- --target=m68k-elf --disable-nls
make -j4
sudo make install

Using the debugger

Here's a final note about using gdb to debug megadrive programs.

Ideally one day I'd like to add a gdb stub in my megadrive code, and cobble together a serial communication interface between the megadrive and my PC, to debug while running on the actual machine. But again, I'm lazy, so I haven't bothered yet.

Thankfully the developers of the blastem emulator were kind enough to implement a gdb server built into the emulator itself. To use it:

When we do that we're greeted by a stopped emulator and the debugger waiting for us to set breakpoints and/or hit 'c' to continue execution.

$ m68k-elf-gdb -q foo.elf 
Reading symbols from foo.elf...
0x000002d0 in start ()
(gdb) b main
Breakpoint 1 at 0x80c: file src/main.c, line 17.
(gdb) c
Continuing.

Breakpoint 1, main () at src/main.c:17
17              z80_init();
(gdb) s
z80_init () at src/z80.c:11
11              z80_grab_bus();
... and so on ...

Nice, huh?
Greets to the blastem developers for enabling this!

