# HG changeset patch # User John Tsiombikas # Date 1280837219 -3600 # Node ID 85fd61f374d968a74d5e50c415955af5e0944c7c # Parent d9a1bab1c3f57ded4f241a6ae43a062343f6eb68 fixed the bloody intersection bug diff -r d9a1bab1c3f5 -r 85fd61f374d9 Makefile --- a/Makefile Sat Jul 31 22:23:57 2010 +0100 +++ b/Makefile Tue Aug 03 13:06:59 2010 +0100 @@ -1,5 +1,6 @@ src = $(wildcard src/*.cc) obj = $(src:.cc=.o) +dep = $(obj:.o=.d) bin = test CXX = g++ @@ -17,6 +18,11 @@ $(bin): $(obj) $(CXX) -o $@ $(obj) $(LDFLAGS) +-include $(dep) + +%.d: %.cc + @$(CPP) $(CXXFLAGS) -MM -MT $(@:.d=.o) $< >$@ + .PHONY: clean clean: - rm -f $(obj) $(bin) + rm -f $(obj) $(bin) $(dep) diff -r d9a1bab1c3f5 -r 85fd61f374d9 rt.cl --- a/rt.cl Sat Jul 31 22:23:57 2010 +0100 +++ b/rt.cl Tue Aug 03 13:06:59 2010 +0100 @@ -1,3 +1,5 @@ +/* vim: set ft=opencl:ts=4:sw=4 */ + struct RendInfo { int xsz, ysz; int num_faces, num_lights; @@ -7,19 +9,22 @@ struct Vertex { float4 pos; float4 normal; - float2 tex; + float4 tex; + float4 padding; }; struct Face { struct Vertex v[3]; float4 normal; int matid; + int padding[3]; }; struct Material { float4 kd, ks; float kr, kt; float spow; + float padding; }; struct Light { @@ -32,7 +37,7 @@ struct SurfPoint { float t; - float4 pos, norm; + float4 pos, norm, dbg; global const struct Face *obj; global const struct Material *mat; }; @@ -44,9 +49,8 @@ bool intersect(struct Ray ray, global const struct Face *face, struct SurfPoint *sp); float4 reflect(float4 v, float4 n); float4 transform(float4 v, global const float *xform); -struct Ray transform_ray(global const struct Ray *ray, global const float *xform); -float4 calc_bary(float4 pt, global const struct Face *face); - +struct Ray transform_ray(global const struct Ray *ray, global const float *xform, global const float *invtrans); +float4 calc_bary(float4 pt, global const struct Face *face, float4 norm); kernel void render(global float4 *fb, global const struct RendInfo *rinf, @@ -54,11 +58,12 @@ global const struct Material *matlib, global const struct Light *lights, global const struct Ray *primrays, - global const float *xform) + global const float *xform, + global const float *invtrans) { int idx = get_global_id(0); - struct Ray ray = transform_ray(primrays + idx, xform); + struct Ray ray = transform_ray(primrays + idx, xform, invtrans); struct SurfPoint sp, sp0; sp0.t = FLT_MAX; @@ -81,61 +86,86 @@ float4 shade(struct Ray ray, struct SurfPoint sp, global const struct Light *lights, int num_lights) { + float4 norm = sp.norm; + bool entering = true; + + if(dot(ray.dir, norm) >= 0.0) { + norm = -norm; + entering = false; + } + float4 dcol = (float4)(0, 0, 0, 0); float4 scol = (float4)(0, 0, 0, 0); for(int i=0; ispow); dcol += sp.mat->kd * diff * lights[i].color; - scol += sp.mat->ks * spec * lights[i].color; + //scol += sp.mat->ks * spec * lights[i].color; } return dcol + scol; } +float dot3(float4 a, float4 b) +{ + return a.x * b.x + a.y * b.y + a.z * b.z; +} + + bool intersect(struct Ray ray, global const struct Face *face, struct SurfPoint *sp) { - float ndotdir = dot(face->normal, ray.dir); + float4 origin = ray.origin; + float4 dir = ray.dir; + float4 norm = face->normal; + + float ndotdir = dot3(dir, norm); + if(fabs(ndotdir) <= EPSILON) { return false; } float4 pt = face->v[0].pos; - float4 vec = pt - ray.origin; + float4 vec = pt - origin; - float ndotvec = dot(face->normal, vec); + float ndotvec = dot3(norm, vec); float t = ndotvec / ndotdir; if(t < EPSILON || t > 1.0) { return false; } - pt = ray.origin + ray.dir * t; + pt = origin + dir * t; - float4 bc = calc_bary(pt, face); + if(pt.w < 0.0) return false; + + + float4 bc = calc_bary(pt, face, norm); float bc_sum = bc.x + bc.y + bc.z; - if(bc_sum < -EPSILON || bc_sum > 1.0) { + if(bc_sum < 0.0 || bc_sum > 1.0 + EPSILON) { return false; + bc *= 1.2; } sp->t = t; sp->pos = pt; - sp->norm = face->normal; + sp->norm = norm; sp->obj = face; + sp->dbg = bc; return true; } float4 reflect(float4 v, float4 n) { - return 2.0f * dot(v, n) * n - v; + float4 res = 2.0f * dot(v, n) * n - v; + return res; } float4 transform(float4 v, global const float *xform) @@ -144,33 +174,28 @@ res.x = v.x * xform[0] + v.y * xform[4] + v.z * xform[8] + xform[12]; res.y = v.x * xform[1] + v.y * xform[5] + v.z * xform[9] + xform[13]; res.z = v.x * xform[2] + v.y * xform[6] + v.z * xform[10] + xform[14]; - res.w = 1.0; + res.w = 0.0; return res; } -struct Ray transform_ray(global const struct Ray *ray, global const float *xform) +struct Ray transform_ray(global const struct Ray *ray, global const float *xform, global const float *invtrans) { struct Ray res; - float rot[16]; - - for(int i=0; i<16; i++) { - rot[i] = xform[i]; - } - rot[3] = rot[7] = rot[11] = rot[12] = rot[13] = rot[14] = 0.0f; - rot[15] = 1.0f; - res.origin = transform(ray->origin, xform); - res.dir = transform(ray->dir, xform); + res.dir = transform(ray->dir, invtrans); return res; } -float4 calc_bary(float4 pt, global const struct Face *face) +float4 calc_bary(float4 pt, global const struct Face *face, float4 norm) { - float4 bc = {0, 0, 0, 0}; + float4 bc = (float4)(0, 0, 0, 0); - float4 vi = face->v[1].pos - face->v[0].pos; - float4 vj = face->v[2].pos - face->v[0].pos; - float area = fabs(dot(cross(vi, vj), face->normal) / 2.0); + // calculate area of the whole triangle + float4 v1 = face->v[1].pos - face->v[0].pos; + float4 v2 = face->v[2].pos - face->v[0].pos; + float4 xv1v2 = cross(v1, v2); + + float area = fabs(dot3(xv1v2, norm)) * 0.5; if(area < EPSILON) { return bc; } @@ -179,10 +204,14 @@ float4 pv1 = face->v[1].pos - pt; float4 pv2 = face->v[2].pos - pt; - // calculate the areas of each sub-triangle - float a0 = fabs(dot(cross(pv1, pv2), face->normal) / 2.0); - float a1 = fabs(dot(cross(pv2, pv0), face->normal) / 2.0); - float a2 = fabs(dot(cross(pv0, pv1), face->normal) / 2.0); + // calculate the area of each sub-triangle + float4 x12 = cross(pv1, pv2); + float4 x20 = cross(pv2, pv0); + float4 x01 = cross(pv0, pv1); + + float a0 = fabs(dot3(x12, norm)) * 0.5; + float a1 = fabs(dot3(x20, norm)) * 0.5; + float a2 = fabs(dot3(x01, norm)) * 0.5; bc.x = a0 / area; bc.y = a1 / area; diff -r d9a1bab1c3f5 -r 85fd61f374d9 src/clray.cc --- a/src/clray.cc Sat Jul 31 22:23:57 2010 +0100 +++ b/src/clray.cc Tue Aug 03 13:06:59 2010 +0100 @@ -8,6 +8,7 @@ #include #endif #include "rt.h" +#include "matrix.h" void cleanup(); void disp(); @@ -17,13 +18,14 @@ void motion(int x, int y); bool write_ppm(const char *fname, float *fb, int xsz, int ysz); -static float *fb; static int xsz, ysz; static bool need_update = true; static float cam_theta, cam_phi = 25.0; static float cam_dist = 10.0; +static bool dbg_glrender; + int main(int argc, char **argv) { glutInitWindowSize(800, 600); @@ -40,8 +42,7 @@ glutMouseFunc(mouse); glutMotionFunc(motion); - fb = new float[xsz * ysz * 4]; - if(!init_renderer(xsz, ysz, fb)) { + if(!init_renderer(xsz, ysz)) { return 1; } atexit(cleanup); @@ -60,42 +61,60 @@ void cleanup() { - delete [] fb; destroy_renderer(); } +static Matrix4x4 mat, inv_mat, inv_trans; + void disp() { glMatrixMode(GL_MODELVIEW); glLoadIdentity(); if(need_update) { - float mat[16]; + glPushMatrix(); + glRotatef(-cam_theta, 0, 1, 0); + glRotatef(-cam_phi, 1, 0, 0); + glTranslatef(0, 0, cam_dist); - glPushMatrix(); - glRotatef(cam_theta, 0, 1, 0); - glRotatef(cam_phi, 1, 0, 0); - glTranslatef(0, 0, -cam_dist); + glGetFloatv(GL_MODELVIEW_MATRIX, mat.m); - glGetFloatv(GL_MODELVIEW_MATRIX, mat); - set_xform(mat); + inv_mat = mat; + inv_mat.invert(); + + /*inv_trans = inv_mat; + inv_trans.transpose();*/ + inv_trans = mat; + inv_trans.m[3] = inv_trans.m[7] = inv_trans.m[11] = 0.0; + inv_trans.m[12] = inv_trans.m[13] = inv_trans.m[14] = 0.0; + inv_trans.m[15] = 1.0; + + set_xform(mat.m, inv_trans.m); glPopMatrix(); - render(); - glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, xsz, ysz, GL_RGBA, GL_FLOAT, fb); + if(!render()) { + exit(1); + } need_update = false; } - glEnable(GL_TEXTURE_2D); + if(dbg_glrender) { + glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT); + glLoadMatrixf(inv_mat.m); + dbg_render_gl(); + } else { + glEnable(GL_TEXTURE_2D); - glBegin(GL_QUADS); - glTexCoord2f(0, 1); glVertex2f(-1, -1); - glTexCoord2f(1, 1); glVertex2f(1, -1); - glTexCoord2f(1, 0); glVertex2f(1, 1); - glTexCoord2f(0, 0); glVertex2f(-1, 1); - glEnd(); + glBegin(GL_QUADS); + glColor3f(1, 1, 1); + glTexCoord2f(0, 1); glVertex2f(-1, -1); + glTexCoord2f(1, 1); glVertex2f(1, -1); + glTexCoord2f(1, 0); glVertex2f(1, 1); + glTexCoord2f(0, 0); glVertex2f(-1, 1); + glEnd(); - glDisable(GL_TEXTURE_2D); + glDisable(GL_TEXTURE_2D); + } glutSwapBuffers(); } @@ -116,14 +135,16 @@ case 27: exit(0); - case 's': - if(write_ppm("shot.ppm", fb, xsz, ysz)) { - printf("captured screenshot shot.ppm\n"); - } + case 'r': + need_update = true; + glutPostRedisplay(); break; - case 'r': - need_update = true; + case 'd': + dbg_glrender = !dbg_glrender; + if(dbg_glrender) { + printf("DEBUG GL RENDER\n"); + } glutPostRedisplay(); break; @@ -160,7 +181,7 @@ cam_theta += dx * ROT_SCALE; cam_phi += dy * ROT_SCALE; - if(cam_phi < -89) cam_phi = 89; + if(cam_phi < -89) cam_phi = -89; if(cam_phi > 89) cam_phi = 89; need_update = true; diff -r d9a1bab1c3f5 -r 85fd61f374d9 src/mesh.h --- a/src/mesh.h Sat Jul 31 22:23:57 2010 +0100 +++ b/src/mesh.h Tue Aug 03 13:06:59 2010 +0100 @@ -6,19 +6,22 @@ struct Vertex { float pos[4]; float normal[4]; - float tex[2]; + float tex[4]; + float padding[4]; }; struct Face { Vertex v[3]; float normal[4]; int matid; + int padding[3]; }; struct Material { float kd[4], ks[4]; float kr, kt; float spow; + float padding; }; struct Mesh { diff -r d9a1bab1c3f5 -r 85fd61f374d9 src/ocl.cc --- a/src/ocl.cc Sat Jul 31 22:23:57 2010 +0100 +++ b/src/ocl.cc Tue Aug 03 13:06:59 2010 +0100 @@ -78,9 +78,14 @@ { int err; cl_mem mem; + cl_mem_flags flags = rdwr | CL_MEM_ALLOC_HOST_PTR; + if(buf) { + flags |= CL_MEM_COPY_HOST_PTR; + } - if(!(mem = clCreateBuffer(ctx, rdwr | CL_MEM_USE_HOST_PTR, sz, buf, &err))) { + + if(!(mem = clCreateBuffer(ctx, flags, sz, buf, &err))) { fprintf(stderr, "failed to create memory buffer: %s\n", clstrerror(err)); return 0; } @@ -88,13 +93,13 @@ CLMemBuffer *mbuf = new CLMemBuffer; mbuf->mem = mem; mbuf->size = sz; + mbuf->ptr = 0; return mbuf; } void destroy_mem_buffer(CLMemBuffer *mbuf) { if(mbuf) { - clReleaseMemObject(mbuf->mem); delete mbuf; } @@ -104,6 +109,12 @@ { if(!mbuf) return 0; +#ifndef NDEBUG + if(mbuf->ptr) { + fprintf(stderr, "WARNING: map_mem_buffer called on already mapped buffer\n"); + } +#endif + int err; mbuf->ptr = clEnqueueMapBuffer(cmdq, mbuf->mem, 1, rdwr, 0, mbuf->size, 0, 0, 0, &err); if(!mbuf->ptr) { @@ -117,6 +128,7 @@ { if(!mbuf || !mbuf->ptr) return; clEnqueueUnmapMemObject(cmdq, mbuf->mem, mbuf->ptr, 0, 0, 0); + mbuf->ptr = 0; } bool write_mem_buffer(CLMemBuffer *mbuf, size_t sz, void *src) diff -r d9a1bab1c3f5 -r 85fd61f374d9 src/ocl.h --- a/src/ocl.h Sat Jul 31 22:23:57 2010 +0100 +++ b/src/ocl.h Tue Aug 03 13:06:59 2010 +0100 @@ -27,7 +27,7 @@ void *ptr; }; -CLMemBuffer *create_mem_buffer(int rdwr, size_t sz, void *buf); +CLMemBuffer *create_mem_buffer(int rdwr, size_t sz, void *buf = 0); void destroy_mem_buffer(CLMemBuffer *mbuf); void *map_mem_buffer(CLMemBuffer *mbuf, int rdwr); @@ -72,7 +72,7 @@ bool set_argi(int arg, int val); bool set_argf(int arg, float val); - bool set_arg_buffer(int arg, int rdwr, size_t sz, void *buf); + bool set_arg_buffer(int arg, int rdwr, size_t sz, void *buf = 0); CLMemBuffer *get_arg_buffer(int arg); bool build(); diff -r d9a1bab1c3f5 -r 85fd61f374d9 src/rt.cc --- a/src/rt.cc Sat Jul 31 22:23:57 2010 +0100 +++ b/src/rt.cc Tue Aug 03 13:06:59 2010 +0100 @@ -2,36 +2,43 @@ #include #include #include + +#ifndef __APPLE__ +#include +#include +#else +#include +#include +#endif + #include "ocl.h" #include "mesh.h" -#ifdef __GNUC__ -#define PACKED __attribute__((packed)) -#else -#define PACKED -#endif - -#ifdef _MSC_VER -#pragma push(pack, 1) -#endif +// kernel arguments +enum { + KARG_FRAMEBUFFER, + KARG_RENDER_INFO, + KARG_FACES, + KARG_MATLIB, + KARG_LIGHTS, + KARG_PRIM_RAYS, + KARG_XFORM, + KARG_INVTRANS_XFORM +}; struct RendInfo { int xsz, ysz; int num_faces, num_lights; int max_iter; -} PACKED; +}; struct Ray { float origin[4], dir[4]; -} PACKED; +}; struct Light { float pos[4], color[4]; -} PACKED; - -#ifdef _MSC_VER -#pragma pop(pack) -#endif +}; static Ray get_primary_ray(int x, int y, int w, int h, float vfov_deg); @@ -42,39 +49,36 @@ static Face faces[] = { {/* face0 */ { - {{-1, 0, 0, 1}, {0, 0, -1, 1}, {0, 0}}, - {{0, 1, 0, 1}, {0, 0, -1, 1}, {0, 0}}, - {{1, 0, 0, 1}, {0, 0, -1, 1}, {0, 0}} + {{-1, 0, 0, 0}, {0, 0, -1, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}}, + {{0, 1, 0, 0}, {0, 0, -1, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}}, + {{1, 0, 0, 0}, {0, 0, -1, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}} }, - {0, 0, -1, 1}, 0 + {0, 0, -1, 0}, 0, {0, 0, 0} }, {/* face1 */ { - {{-5, 0, -3, 1}, {0, 0, -1, 1}, {0, 0}}, - {{0, 0, 3, 1}, {0, 0, -1, 1}, {0, 0}}, - {{5, 0, -3, 1}, {0, 0, -1, 1}, {0, 0}} + {{-5, 0, -3, 0}, {0, 1, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}}, + {{0, 0, 3, 0}, {0, 1, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}}, + {{5, 0, -3, 0}, {0, 1, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}} }, - {0, 0, -1, 1}, 1 + {0, 1, 0, 0}, 1, {0, 0, 0} } }; static Material matlib[] = { - {{1, 0, 0, 1}, {1, 1, 1, 1}, 0, 0, 60.0}, - {{0.2, 0.8, 0.3, 1}, {0, 0, 0, 0}, 0, 0, 0} + {{1, 0, 0, 1}, {1, 1, 1, 1}, 0, 0, 60.0, 0}, + {{0.2, 0.8, 0.3, 1}, {0, 0, 0, 0}, 0, 0, 0, 0} }; static Light lightlist[] = { - {{-10, 10, -20, 1}, {1, 1, 1, 1}} + {{-10, 10, -20, 0}, {1, 1, 1, 1}} }; -static float xform[16] = { - 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1 -}; static RendInfo rinf; -bool init_renderer(int xsz, int ysz, float *fb) +bool init_renderer(int xsz, int ysz) { // render info rinf.xsz = xsz; @@ -99,13 +103,16 @@ } /* setup argument buffers */ - prog->set_arg_buffer(0, ARG_WR, xsz * ysz * 4 * sizeof(float), fb); - prog->set_arg_buffer(1, ARG_RD, sizeof rinf, &rinf); - prog->set_arg_buffer(2, ARG_RD, sizeof faces, faces); - prog->set_arg_buffer(3, ARG_RD, sizeof matlib, matlib); - prog->set_arg_buffer(4, ARG_RD, sizeof lightlist, lightlist); - prog->set_arg_buffer(5, ARG_RD, xsz * ysz * sizeof *prim_rays, prim_rays); - prog->set_arg_buffer(6, ARG_RD, sizeof xform, &xform); + prog->set_arg_buffer(KARG_FRAMEBUFFER, ARG_WR, xsz * ysz * 4 * sizeof(float)); + prog->set_arg_buffer(KARG_RENDER_INFO, ARG_RD, sizeof rinf, &rinf); + prog->set_arg_buffer(KARG_FACES, ARG_RD, sizeof faces, faces); + prog->set_arg_buffer(KARG_MATLIB, ARG_RD, sizeof matlib, matlib); + prog->set_arg_buffer(KARG_LIGHTS, ARG_RD, sizeof lightlist, lightlist); + prog->set_arg_buffer(KARG_PRIM_RAYS, ARG_RD, xsz * ysz * sizeof *prim_rays, prim_rays); + prog->set_arg_buffer(KARG_XFORM, ARG_RD, 16 * sizeof(float)); + prog->set_arg_buffer(KARG_INVTRANS_XFORM, ARG_RD, 16 * sizeof(float)); + + delete [] prim_rays; global_size = xsz * ysz; return true; @@ -113,7 +120,6 @@ void destroy_renderer() { - delete [] prim_rays; delete prog; } @@ -124,19 +130,63 @@ } CLMemBuffer *mbuf = prog->get_arg_buffer(0); - map_mem_buffer(mbuf, MAP_RD); + void *fb = map_mem_buffer(mbuf, MAP_RD); + glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, rinf.xsz, rinf.ysz, GL_RGBA, GL_FLOAT, fb); unmap_mem_buffer(mbuf); return true; } -void set_xform(float *matrix) +void dbg_render_gl() { - CLMemBuffer *mbuf = prog->get_arg_buffer(6); - assert(mbuf); + glPushAttrib(GL_ENABLE_BIT | GL_TRANSFORM_BIT); - assert(map_mem_buffer(mbuf, MAP_WR) == xform); - memcpy(xform, matrix, sizeof xform); - unmap_mem_buffer(mbuf); + glDisable(GL_TEXTURE_2D); + glEnable(GL_DEPTH_TEST); + + glMatrixMode(GL_PROJECTION); + glPushMatrix(); + glLoadIdentity(); + gluPerspective(45.0, (float)rinf.xsz / (float)rinf.ysz, 0.5, 1000.0); + + glBegin(GL_TRIANGLES); + for(int i=0; ikd[0], mat->kd[1], mat->kd[2]); + + for(int j=0; j<3; j++) { + float *pos = faces[i].v[j].pos; + glVertex3f(pos[0], pos[1], pos[2]); + } + } + glEnd(); + + glPopMatrix(); + glPopAttrib(); +} + +void set_xform(float *matrix, float *invtrans) +{ + CLMemBuffer *mbuf_xform = prog->get_arg_buffer(KARG_XFORM); + CLMemBuffer *mbuf_invtrans = prog->get_arg_buffer(KARG_INVTRANS_XFORM); + assert(mbuf_xform && mbuf_invtrans); + + float *mem = (float*)map_mem_buffer(mbuf_xform, MAP_WR); + memcpy(mem, matrix, 16 * sizeof *mem); + printf("-- xform:\n"); + for(int i=0; i<16; i++) { + printf("%2.3f\t", mem[i]); + if(i % 4 == 3) putchar('\n'); + } + unmap_mem_buffer(mbuf_xform); + + mem = (float*)map_mem_buffer(mbuf_invtrans, MAP_WR); + memcpy(mem, invtrans, 16 * sizeof *mem); + printf("-- inverse-transpose:\n"); + for(int i=0; i<16; i++) { + printf("%2.3f\t", mem[i]); + if(i % 4 == 3) putchar('\n'); + } + unmap_mem_buffer(mbuf_invtrans); } static Ray get_primary_ray(int x, int y, int w, int h, float vfov_deg) @@ -155,6 +205,6 @@ py *= 100.0; pz *= 100.0; - Ray ray = {{0, 0, 0, 1}, {px, py, pz, 1}}; + Ray ray = {{0, 0, 0, 1}, {px, py, -pz, 1}}; return ray; } diff -r d9a1bab1c3f5 -r 85fd61f374d9 src/rt.h --- a/src/rt.h Sat Jul 31 22:23:57 2010 +0100 +++ b/src/rt.h Tue Aug 03 13:06:59 2010 +0100 @@ -1,9 +1,11 @@ #ifndef RT_H_ #define RT_H_ -bool init_renderer(int xsz, int ysz, float *fb); +bool init_renderer(int xsz, int ysz); void destroy_renderer(); bool render(); -void set_xform(float *matrix); +void set_xform(float *matrix, float *invtrans); + +void dbg_render_gl(); #endif /* RT_H_ */