# HG changeset patch
# User John Tsiombikas
# Date 1397106232 -10800
# Node ID 235c8b764c0b1e314ba8fda82e6086394104ee3f
# Parent 70e332156d0267d4261da69943dc928dedf40d9a
optimized swap_buffers

diff -r 70e332156d02 -r 235c8b764c0b Makefile
--- a/Makefile	Thu Apr 10 02:31:31 2014 +0300
+++ b/Makefile	Thu Apr 10 08:03:52 2014 +0300
@@ -1,4 +1,4 @@
-baseobj = main.obj logger.obj screen.obj scrman.obj
+baseobj = main.obj logger.obj screen.obj scrman.obj swapbuf.obj
 modelobj = modeller.obj min3d.obj m3drast.obj lines.obj
 rendobj = renderer.obj vmath.obj
 scnobj = scene.obj object.obj
@@ -6,11 +6,14 @@
 obj = $(baseobj) $(modelobj) $(rendobj) $(scnobj) $(sysobj)
 bin = rayzor.exe
-#dbg = -d2
+opt = -5 -fp5 -otexan
+dbg = -d1
+AS = nasm
 CC = wcc386
 CXX = wpp386
-CFLAGS = $(dbg) -5 -fp5 -otexan -zq -bt=dos -Isrc\stl
+ASFLAGS = -fobj
+CFLAGS = $(dbg) $(opt) -zq -bt=dos -Isrc\stl
 CXXFLAGS = $(CFLAGS)
 LD = wlink
@@ -18,8 +21,9 @@
 	%write objects.lnk file { $(obj) }
 	$(LD) debug all name $@ @objects $(LDFLAGS)
-.c: src\
-.cc: src\
+.c: src
+.cc: src
+.asm: src
 .c.obj: .autodepend
 	$(CC) $(CFLAGS) $[*
@@ -27,6 +31,9 @@
 .cc.obj: .autodepend
 	$(CXX) $(CXXFLAGS) $[*
+.asm.obj:
+	$(AS) $(ASFLAGS) -o $@ $[*.asm
+
 clean: .symbolic
 	del *.obj
 	del $(bin)
diff -r 70e332156d02 -r 235c8b764c0b src/main.cc
--- a/src/main.cc	Thu Apr 10 02:31:31 2014 +0300
+++ b/src/main.cc	Thu Apr 10 08:03:52 2014 +0300
@@ -1,7 +1,6 @@
 #include
 #include
 #include
-#include
 #include
 #include "inttypes.h"
 #include "gfx.h"
@@ -14,6 +13,16 @@
 #include "modeller.h"
 #include "renderer.h"
 #include "scrman.h"
+#include "timer.h"
+
+#ifdef __DOS__
+#undef USE_ASM_SWAPBUF
+#endif
+
+#ifdef USE_ASM_SWAPBUF
+// defined in swapbuf.asm
+extern "C" void swap_buffers_asm(void *dest, void *src, int xsz, int ysz, int bpp);
+#endif
 static bool init();
 static void cleanup();
@@ -31,19 +40,22 @@
 int fb_bpp = 32;
 Scene *scene;
-static int bytespp;
 static bool novideo;
 static void *fb;
 static int rbits, gbits, bbits;
 static int rshift, gshift, bshift;
 static unsigned int rmask, gmask, bmask;
+static bool use_asm_swap = true;
 static bool use_mouse;
 static int mouse_x, mouse_y;
 static bool quit;
 int main(int argc, char **argv)
 {
+	unsigned long start_msec, msec;
+	unsigned long nframes = 0;
+
 	if(!parse_args(argc, argv)) {
 		return 1;
 	}
@@ -51,6 +63,8 @@
 		return 1;
 	}
+	start_msec = get_msec();
+
 	// main loop
 	for(;;) {
 		handle_keyboard();
@@ -58,11 +72,16 @@
 		if(quit) break;
 		display();
+		++nframes;
 		if(novideo) break;
 	}
+	msec = get_msec() - start_msec;
+
 	cleanup();
+
+	printf("Average framerate: %g\n", (float)nframes / ((float)msec / 1000.0f));
 	printf("Thank you for using Rayzor!\n");
 	return 0;
 }
@@ -79,6 +98,8 @@
 	signal(SIGILL, sig);
 	signal(SIGFPE, sig);
+	init_timer(128);
+
 	if(!novideo) {
 		if(kb_init(32) == -1) {
 			fprintf(stderr, "failed to initialize keyboard driver\n");
@@ -94,8 +115,8 @@
 		get_color_bits(&rbits, &gbits, &bbits);
 		get_color_shift(&rshift, &gshift, &bshift);
 		get_color_mask(&rmask, &gmask, &bmask);
-		bytespp = (int)ceil(fb_bpp / 8.0);
+		printlog("video resolution: %dx%d\n", fb_width, fb_height);
 		printlog("bpp: %d (%d %d %d)\n", fb_bpp, rbits, gbits, bbits);
 		printlog("shift: %d %d %d\n", rshift, gshift, bshift);
 		printlog("mask: %x %x %x\n", rmask, gmask, bmask);
@@ -107,9 +128,8 @@
 	} else {
 		logger_output(stdout);
 		printlog("novideo (debug) mode\n");
-		fb_bpp = 32;
+		fb_bpp = 24;
 		rbits = gbits = bbits = 8;
-		bytespp = 3;
 	}
 	fb_pixels = new uint32_t[fb_width * fb_height * 4];
@@ -165,8 +185,12 @@
 	}
 	if(!novideo) {
+		wait_vsync();
+#ifdef USE_ASM_SWAPBUF
+		swap_buffers_asm(fb, fb_pixels, fb_width, fb_height, fb_bpp);
+#else
 		swap_buffers();
-		wait_vsync();
+#endif
 	}
 }
@@ -175,9 +199,9 @@
 	(((g) << gshift) & gmask) | \
 	(((b) << bshift) & bmask))
-#define UNPACK_RED(c) (((c) >> 16) & 0xff)
+#define UNPACK_RED(c) ((c) & 0xff)
 #define UNPACK_GREEN(c) (((c) >> 8) & 0xff)
-#define UNPACK_BLUE(c) ((c) & 0xff)
+#define UNPACK_BLUE(c) (((c) >> 16) & 0xff)
 static void swap_buffers()
 {
@@ -192,12 +216,13 @@
 	case 24:
 		{
 			unsigned char *dest = (unsigned char*)fb;
-			for(int i=0; ihandle_keyboard(key, true); // TODO also generate release events...
 		}
 	}
@@ -285,6 +318,9 @@
 	}
 	prev_mx = mx;
 	prev_my = my;
+
+	mouse_x = mx;
+	mouse_y = my;
 }
diff -r 70e332156d02 -r 235c8b764c0b src/swapbuf.asm
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/src/swapbuf.asm	Thu Apr 10 08:03:52 2014 +0300
@@ -0,0 +1,46 @@
+; vim:set ft=nasm:
+	segment code use32
+
+	; void swap_buffers_asm(void *dest, void *src, int xsz, int ysz, int bpp)
+	; dest -> eax
+	; src -> edx
+	; xsz -> ebx
+	; ysz -> ecx
+	; bpp -> [ebp + 8] (after pushing ebp)
+	global swap_buffers_asm_
+swap_buffers_asm_:
+	push ebp
+	mov ebp, esp
+
+	mov edi, eax ; let's hold dest ptr in edi, frees up eax
+	mov esi, edx ; let's hold src ptr in esi, frees up edx
+	; calculate pixel count -> ecx, frees up ebx
+	mov eax, ebx
+	mul ecx
+	mov ecx, eax ; now ecx = xsz * ysz
+
+	mov eax, [ebp + 8] ; eax <- bpp
+	cmp eax, 32
+	je .bpp32
+	cmp eax, 24
+	je .bpp24
+	cmp eax, 16
+	je .bpp16
+	; invalid bpp, ignore
+	jmp .done
+
+.bpp32: ; 32bit block transfer, no conversion
+	rep movsd ; esi, edi, and ecx already loaded, just go...
+	jmp .done
+
+.bpp24: ; 32bpp -> 24bpp conversion (LSB-first), 1 byte overrun!
+	movsd ; transfer a full 32bit chunk and inc esi,edi by 4
+	dec edi ; backtrack dest one byte after last transfer
+	dec ecx
+	jnz .bpp24
+	jmp .done
+
+.bpp16: ; fuck 16bpp for now (TODO)
+.done:
+	pop ebp
+	ret
diff -r 70e332156d02 -r 235c8b764c0b src/timer.h
--- a/src/timer.h	Thu Apr 10 02:31:31 2014 +0300
+++ b/src/timer.h	Thu Apr 10 08:03:52 2014 +0300
@@ -18,6 +18,10 @@
 #ifndef TIMER_H_
 #define TIMER_H_
+#ifdef __cplusplus
+extern "C" {
+#endif
+
 /* expects the required timer resolution in hertz
  * if res_hz is 0, the current resolution is retained
  */
@@ -26,4 +30,8 @@
 void reset_timer(void);
 unsigned long get_msec(void);
+#ifdef __cplusplus
+}
+#endif
+
 #endif /* TIMER_H_ */
diff -r 70e332156d02 -r 235c8b764c0b util/fixcase
--- a/util/fixcase	Thu Apr 10 02:31:31 2014 +0300
+++ b/util/fixcase	Thu Apr 10 08:03:52 2014 +0300
@@ -1,6 +1,6 @@
 #!/bin/sh
-src=`find \( -iname '*.c' -o -iname '*.cc' -o -iname '*.h' -o -iname '*.inl' \)`
+src=`find \( -iname '*.c' -o -iname '*.cc' -o -iname '*.h' -o -iname '*.inl' -o -iname '*.asm' \)`
 for i in $src util/*; do
 	if echo $i | grep '[A-Z]' >/dev/null; then
 		fixed=`echo $i | tr '[:upper:]' '[:lower:]'`