John Tsiombikas nuclear@mutantstargoat.com
5 January 2024
This is more of a "notes for future reference" post than anything else. If you get to the end of it and ask yourself "well, what's your point?", that's why: I don't necessarily have a point to make...
I recently decided to revisit megadrive hacking. As a Debian user, and most
importantly a lazy person, my immediate automatic response to the need for a
toolchain for it was: apt-get install gcc-m68k-linux-gnu. And that works
perfectly fine for the most part (and is delightfully lazy, so that's a plus).
The fact that it's meant to build 68k GNU/Linux programs, and not flat binaries
for the megadrive ROM cartridge is no problem; just make the appropriate linker
script and pass the correct arguments to gcc and ld, and it works... until
it doesn't.
For the record here's my linker script for megadrive projects. I intend to write an article at some point, explaining how linker scripts work and how awesome they are.
OUTPUT_ARCH(m68k)

MEMORY
{
	rom : ORIGIN = 0x00000000, LENGTH = 0x00a00000
	ram : ORIGIN = 0x00ff0000, LENGTH = 0x00010000
}

PROVIDE (_stacktop = 0x01000000);

SECTIONS {
	/* ---- start of ROM ---- */
	/* .vect section is used to place the m68k exception vectors at the
	 * beginning of the address space
	 */
	.vect : { * (.vect); } >rom
	/* .romhdr section is used to place the SEGA ROM header at 0x100 */
	. = 0x100;
	.romhdr : { * (.romhdr); } >rom
	.text : { * (.text); } >rom
	.rodata : { * (.rodata); } >rom

	/* place the load address of the .data section after .rodata */
	. = ALIGN(4);
	_data_lma = .;
	_rom_end = _data_lma + _data_size;

	/* ---- start of RAM ---- */
	. = 0xff0000;
	/* place the .data section at the start of RAM */
	.data ALIGN(4): AT (_data_lma) {
		_data_start = .;
		* (.data);
		. = ALIGN(4);
		_data_end = .;
	} >ram
	_data_size = SIZEOF(.data);

	/* place the .bss section at the end */
	.bss ALIGN(4): {
		_bss_start = .;
		* (.bss);
		* (COMMON);
		. = ALIGN(4);
		_bss_end = .;
	} >ram
	_bss_size = SIZEOF(.bss);
}
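A linker script like this only decides where things live; it doesn't move anything at run time. Since .data is loaded in ROM at _data_lma but expected in RAM at _data_start, something in the startup code has to copy it over, and clear .bss while it's at it. Here's a minimal sketch of that step in C. The function names copy_data and clear_bss are made up for illustration, and this is not necessarily how my startup code does it; only the underscore-prefixed symbols mentioned in the comment come from the script above.

```c
#include <stdint.h>

/* Copy the initial values of .data from their load address in ROM to
 * their run-time address in RAM, one 32-bit word at a time. The linker
 * script aligns both ends of the section to 4 bytes, so there is no
 * partial word left over at the end. */
void copy_data(uint32_t *dst, const uint32_t *dst_end, const uint32_t *src)
{
	while(dst < dst_end) {
		*dst++ = *src++;
	}
}

/* Clear .bss, which C code expects to start out zeroed */
void clear_bss(uint32_t *start, const uint32_t *end)
{
	while(start < end) {
		*start++ = 0;
	}
}

/* On the target, the arguments would be the linker script symbols
 * (declared as arrays, so that their names evaluate to addresses):
 *
 *   extern uint32_t _data_start[], _data_end[], _data_lma[];
 *   extern uint32_t _bss_start[], _bss_end[];
 *
 *   copy_data(_data_start, _data_end, _data_lma);
 *   clear_bss(_bss_start, _bss_end);
 */
```

Taking plain pointers instead of referring to the symbols directly keeps the two loops testable on the host; on the megadrive they would run before main is called.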
Everything was going smoothly, until I added my printf implementation to the
source code, for debug output, and used it to print a number. My printf had the
audacity to divide by 10 (and also modulo 10), which immediately led to a hang.
If I printed in hex all was well, but the first appearance of a "%d" just
stopped the megadrive in its tracks. You must understand that dividing on the
68000 is to be avoided whenever possible, and there were no other divisions in
any of my megadrive hacks, so it took some time to figure out what was going on.
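To make it concrete where the division comes from, here's a sketch of the kind of loop a "%d" conversion boils down to. This is not my actual printf code, and utoa10 is a hypothetical name. The 68000's divide instructions only take a 16-bit divisor and produce a 16-bit quotient, so for full 32-bit operands the compiler may emit calls to libgcc helpers instead, such as __udivsi3 and __umodsi3 for unsigned values, or __divsi3 and __modsi3 for signed ones. Hex output gets away with shifts and masks, which is why it never touched libgcc.

```c
#include <stdint.h>

/* Write the decimal representation of val into buf, which must have
 * room for at least 11 bytes, and return buf. */
char *utoa10(uint32_t val, char *buf)
{
	char tmp[10];	/* a 32-bit value has at most 10 decimal digits */
	int i = 0, j = 0;

	do {
		tmp[i++] = '0' + val % 10;	/* may become a libgcc call on the 68000 */
		val /= 10;			/* ... and so may this */
	} while(val);

	/* the digits come out least significant first; reverse them */
	while(i > 0) {
		buf[j++] = tmp[--i];
	}
	buf[j] = 0;
	return buf;
}
```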
The hang was obviously due to an exception, since all my unused interrupt
vectors point to a simple stop #0x2700 (disable interrupts and stop), but
which interrupt and why? It can't be a division by 0, since that code divides by
a hardcoded 10. Changing the interrupt handler to print some information
about the exception before dying is most illuminating:
Clearly 227d is not a reasonable address to appear in the program counter of a processor which executes 16-bit aligned instructions. Let's take a look at the disassembly to see how we end up there...
00002270 <__modsi3>:
...
227c: 61ff bsrs 227d <__modsi3+0xd>
227e: ffff .short 0xffff
2280: ff92 .short 0xff92
2282: 508f addql #8,%sp
...
__modsi3 is a function in libgcc, used to compute the modulo. It makes sense
for the issue to be there, since the exception started appearing when I first
started dividing and computing modulo by 10.
But 61ff at address 227c is the first part of a branch to subroutine
instruction with a 32bit relative offset, not a short bsr by -1 ending up at
227d! Unfortunately, the 68000 doesn't have a bsr with a 32bit relative
offset; only 8 or 16bit. The instruction we see here is available on the 68020
or later processors.
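The two readings of those bytes can be worked out by hand, but a small decoder makes the difference obvious. The following is a sketch written for this post, not part of any toolchain, and bsr_target is a hypothetical name. It relies on the bsr encoding rules: the low byte of the opcode word is an 8-bit displacement, a value of 00 means a 16-bit displacement follows, and only on the 68020 and later does ff mean a 32-bit displacement follows. In all cases the displacement is relative to the address of the opcode plus 2.

```c
#include <stdint.h>

/* Return the target of a bsr instruction located at address pc.
 * op is the 16-bit opcode word (0x61xx), ext holds the two words that
 * follow it (first word in the upper 16 bits), and is_68020 selects
 * how a displacement byte of 0xff is interpreted. */
uint32_t bsr_target(uint32_t pc, uint16_t op, uint32_t ext, int is_68020)
{
	uint8_t disp8 = op & 0xff;
	int32_t disp;

	if(disp8 == 0) {
		/* 16-bit displacement in the next word (all 68k processors) */
		disp = (int16_t)(ext >> 16);
	} else if(disp8 == 0xff && is_68020) {
		/* 32-bit displacement in the next two words (68020 and later) */
		disp = (int32_t)ext;
	} else {
		/* plain 8-bit displacement; on the 68000, 0xff is simply -1 */
		disp = (int8_t)disp8;
	}
	/* displacements are relative to the address of the opcode plus 2 */
	return pc + 2 + (uint32_t)disp;
}
```

With the bytes from the disassembly, bsr_target(0x227c, 0x61ff, 0xffffff92, 0) gives 0x227d, the odd address the 68000 tried to jump to, while the same call with is_68020 set to 1 gives 0x2210, which is a far more believable place for a subroutine to start.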
In retrospect it makes sense that the m68k-linux-gnu toolchain in Debian would
be compiled for later Motorola processors, since Linux needs an MMU.
There are two ways to overcome this problem. The laziest, which obviously I went for first, is to just drop the source code of these support functions from libgcc into my project and include them in my build, instead of linking libgcc; that works perfectly fine. Later, however, I went looking for a 68k-compatible gdb, which unfortunately Debian does not provide, and I thought: if I'm going to have to build gdb myself for the 68000, I might as well take the opportunity to build the whole toolchain correctly and be done with it.
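For a rough idea of what such a support function has to do, here's a minimal shift-and-subtract division in C. To be clear, this is not the libgcc code I dropped into my project; it's just an illustrative sketch under a hypothetical name (udivmod32), using only shifts, compares and subtractions, all of which the 68000 handles natively.

```c
#include <stdint.h>

/* Unsigned 32-bit division by shift-and-subtract (restoring division).
 * Returns the quotient, and stores the remainder through rem if rem is
 * a non-null pointer. Division by zero is not handled; the caller must
 * avoid it. */
uint32_t udivmod32(uint32_t num, uint32_t den, uint32_t *rem)
{
	uint32_t quot = 0, r = 0;
	int i;

	for(i = 31; i >= 0; i--) {
		/* bring down the next bit of the numerator */
		r = (r << 1) | ((num >> i) & 1);
		if(r >= den) {
			r -= den;
			quot |= (uint32_t)1 << i;
		}
	}
	if(rem) {
		*rem = r;
	}
	return quot;
}
```

It always goes through all 32 iterations, so it's slow, which is one more reason to avoid dividing on the 68000 in the first place.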
To obtain a complete 68000 toolchain, we need to build binutils and gcc for
the target architecture. And let's not forget gdb as well, which is often
useful.
Let's start by downloading the latest release of binutils, which in my case was 2.41:
wget https://ftp.gnu.org/gnu/binutils/binutils-2.41.tar.gz
tar xzvf binutils-2.41.tar.gz
The best way to build these packages is out-of-source, so we can use the same
source tree to build for other targets as well if need arises. We'll pass
--target=m68k-elf to configure to specify the target architecture, along with
--program-prefix=m68k-elf- so that the assembler for instance would be
installed as m68k-elf-as, and --disable-nls because we don't care for
internationalization-induced bloat.
mkdir build-binutils && cd build-binutils
../binutils-2.41/configure --program-prefix=m68k-elf- --target=m68k-elf --disable-nls
make -j4
sudo make install
Similarly with GCC we'll download and extract the latest release:
wget https://ftp.gnu.org/gnu/gcc/gcc-13.2.0/gcc-13.2.0.tar.gz
tar xzvf gcc-13.2.0.tar.gz
And during configure this time I'll add --enable-languages=c because I don't
want any other front-ends (if you want a C++ compiler you should use c,c++
here), and --disable-libssp which is some kind of stack protection code which
we don't need, we don't want, and would also need further dependencies we can't
have on the megadrive. Also we'll have to pass CFLAGS_FOR_TARGET=-m68000 to
the build, to make sure anything compiled for the target machine (like libgcc)
will only use 68000 instructions.
mkdir build-gcc && cd build-gcc
../gcc-13.2.0/configure --target=m68k-elf --program-prefix=m68k-elf- --enable-languages=c \
--disable-libssp --disable-nls
make -j4 CFLAGS_FOR_TARGET=-m68000
sudo make install
And now if we build again and take a look at the disassembly of our program, the 32bit relative branch to subroutine has been replaced by an absolute jump to subroutine, which is perfectly legal on a 68000:
2288: 4eb9 0000 21f0 jsr 21f0 <__udivsi3>
Same routine for gdb:
wget https://ftp.gnu.org/gnu/gdb/gdb-14.1.tar.gz
tar xzvf gdb-14.1.tar.gz
And then:
mkdir build-gdb && cd build-gdb
../gdb-14.1/configure --program-prefix=m68k-elf- --target=m68k-elf --disable-nls
make -j4
sudo make install
Here's a final note about using gdb to debug megadrive programs.
Ideally one day I'd like to add a gdb stub in my megadrive code, and cobble together a serial communication interface between the megadrive and my PC, to debug while running on the actual machine. But again, I'm lazy, so I haven't bothered yet.
Thankfully the developers of the blastem emulator were kind enough to implement a gdb server built into the emulator itself. To use it:
Add a .gdbinit file to the project root, containing the line:
target remote | blastem foo.bin -D
(assuming foo.bin is the flat ROM image of our program).
Run the 68k debugger, passing the ELF binary which contains debug symbols as an argument: m68k-elf-gdb foo.elf
When we do that we're greeted by a stopped emulator and the debugger waiting for us to set breakpoints and/or hit 'c' to continue execution.
$ m68k-elf-gdb -q foo.elf
Reading symbols from foo.elf...
0x000002d0 in start ()
(gdb) b main
Breakpoint 1 at 0x80c: file src/main.c, line 17.
(gdb) c
Continuing.
Breakpoint 1, main () at src/main.c:17
17 z80_init();
(gdb) s
z80_init () at src/z80.c:11
11 z80_grab_bus();
... and so on ...
Nice, huh?
Greets to the blastem developers for enabling this!