The code is available below.

Download this project: path_tracer_texture_mapping.tar.bz2

Below is a test render of the Sponza model. Some features of the original model were stripped for this render, leaving approximately 150,000 triangles.

Below is a render of the Dragon model available from the Stanford 3D Scanning Repository. The rendered model contained 100,000 triangles.

**Hyperplane Separation Theorem**

The hyperplane separation theorem is a theorem about disjoint, convex sets. For our purposes we will be applying the theorem to a combination of an axis-aligned bounding box and a triangle in three dimensions. Since we will consider both the triangle and the axis-aligned bounding box to be compact (and convex), then, provided these two sets are disjoint, we can locate two parallel hyperplanes between them separated by a gap. We need only one separating hyperplane to conclude that the triangle and axis-aligned bounding box do not intersect. Below is an image depicting this theorem in two dimensions. The green line is a separating axis and the black line is a separating line.

In three dimensions, we will have a separating axis and a separating plane. We have a few different contact situations between objects to concern ourselves with: face to face contact, face to edge contact, and edge to edge contact. Thus, our list of potential separating axes comprises the normals to the faces and the cross products of each edge from one object with each edge from the other. The face to edge contacts are handled by the face normals. The axis-aligned bounding box has six faces, but in three sets of two parallel faces, so we have three potential separating axes. The triangle normal is a fourth. The axis-aligned bounding box has 12 edges, but in three sets of four parallel edges, and the triangle has three edges, yielding nine cross products for a total of 13 potential separating axes. Only by completing all thirteen tests without finding a separation can we conclude that the axis-aligned bounding box and the triangle intersect. If any single test yields a separating plane, we need not complete the remaining tests to conclude that the objects are disjoint.

We will first translate the triangle and the axis-aligned bounding box so that the center of the box sits at the origin. Below is some code to detect a separation. We are not concerned with the direction of the projections, only their magnitudes. Because the box is centered at the origin, we define a radius, `r`, based on the half dimensions of the box.

```cpp
bool hyperplaneSeparation(__vector n, __vector p0, __vector p1, __vector p2,
                          double halfWidth, double halfHeight, double halfDepth)
{
    double _p0 = n * p0, _p1 = n * p1, _p2 = n * p2;
    double min = MIN(_p0, MIN(_p1, _p2)), max = MAX(_p0, MAX(_p1, _p2));
    double r = halfWidth * fabs(n.x) + halfHeight * fabs(n.y) + halfDepth * fabs(n.z);
    return -r > max || r < min;
}
```

The `buildTree()` function in the project download uses this method.
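To make the full test concrete, here is a host-side sketch of the complete 13-axis triangle-box overlap test. It uses a standalone `Vec3` type rather than the project's `__vector`, and names like `separatedOn` and `triangleOverlapsBox` are illustrative; `separatedOn` mirrors the `hyperplaneSeparation` logic above.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Returns true if axis n separates the (origin-centered) box from the triangle.
static bool separatedOn(Vec3 n, Vec3 p0, Vec3 p1, Vec3 p2,
                        double hw, double hh, double hd) {
    double q0 = dot(n, p0), q1 = dot(n, p1), q2 = dot(n, p2);
    double lo = std::min(q0, std::min(q1, q2));
    double hi = std::max(q0, std::max(q1, q2));
    double r = hw * std::fabs(n.x) + hh * std::fabs(n.y) + hd * std::fabs(n.z);
    return lo > r || hi < -r;
}

// Full 13-axis test: triangle vertices are given relative to the box center.
bool triangleOverlapsBox(Vec3 p0, Vec3 p1, Vec3 p2,
                         double hw, double hh, double hd) {
    Vec3 e0 = sub(p1, p0), e1 = sub(p2, p1), e2 = sub(p0, p2);
    Vec3 axes[13] = {
        {1, 0, 0}, {0, 1, 0}, {0, 0, 1},   // box face normals
        cross(e0, e1),                      // triangle normal
        cross({1,0,0}, e0), cross({1,0,0}, e1), cross({1,0,0}, e2),
        cross({0,1,0}, e0), cross({0,1,0}, e1), cross({0,1,0}, e2),
        cross({0,0,1}, e0), cross({0,0,1}, e1), cross({0,0,1}, e2),
    };
    for (Vec3& n : axes)
        if (separatedOn(n, p0, p1, p2, hw, hh, hd))
            return false;  // one separating axis is enough
    return true;           // no axis separates them: they overlap
}
```

A degenerate cross product (parallel edges) yields the zero vector, which can never report a separation, so no special-casing is needed.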

**kd tree construction using the surface area heuristic**

kd tree construction using the surface area heuristic is a greedy algorithm. During the build process, we compare the cost of splitting a node with the cost of not splitting it. If the local cost of splitting is less than that of not splitting, we split the node. Otherwise, we convert the current node to a leaf. The cost estimate we will use is given below [2],

\begin{align}

C_V(p) &= K_T + K_I\left( \frac{SA(V_L)}{SA(V)} T_L + \frac{SA(V_R)}{SA(V)} T_R \right) \\

C_{NS} &= K_IT \\

\end{align}

where \(K_T\) is the cost of a traversal, \(K_I\) is the triangle intersection cost, \(SA(V_L), SA(V_R), SA(V)\) are the surface areas of the left node, right node, and current node, respectively, \(T_L, T_R, T\) are the number of triangles in the left node, right node, and current node, respectively, \(C_V(p)\) is the cost of splitting the current node, and \(C_{NS}\) is the cost of not splitting the current node.
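As a small illustration, the cost comparison can be written directly from these equations. The \(K_T\) and \(K_I\) values below are placeholder estimates, not the project's actual constants.

```cpp
// Hypothetical tunable estimates for traversal and intersection cost.
const double K_T = 15.0;  // cost of one traversal step
const double K_I = 20.0;  // cost of one triangle intersection

// SAH cost of splitting: C_V(p) = K_T + K_I * (SA_L/SA * T_L + SA_R/SA * T_R)
double splitCost(double saL, double saR, double sa, int tL, int tR) {
    return K_T + K_I * (saL / sa * tL + saR / sa * tR);
}

// Cost of making the current node a leaf: C_NS = K_I * T
double leafCost(int t) { return K_I * t; }

// Split only when it is locally cheaper than not splitting.
bool shouldSplit(double saL, double saR, double sa, int tL, int tR, int t) {
    return splitCost(saL, saR, sa, tL, tR) < leafCost(t);
}
```

Note how a node with a single triangle never splits: the traversal overhead \(K_T\) alone exceeds any savings.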

We have \(6T\) potential split positions: for each of the three axes, the minimum and maximum value along that axis from each triangle. The algorithm presented here is similar to the \(O(n \cdot \log^2n)\) algorithm described in [2]. For each axis we push the minimum triangle coordinate to a list with a `PRIMITIVE_START` event, and the maximum coordinate with a `PRIMITIVE_END` event. The lists are then sorted by coordinate value, \(O(n \cdot \log n)\). At each split position, a triangle touching the plane is considered to reside in both nodes, so for the first split position we have \(T_L=1\) and \(T_R=T\). As we progress to the next split position, if that event is a `PRIMITIVE_START`, we increment \(T_L\). If the event is a `PRIMITIVE_END`, we decrement \(T_R\) *on the following pass*, since that event corresponds to a triangle that we are including in both nodes (a vertex lies on the split plane). With \(T_L\) and \(T_R\) in hand for each split position, we can evaluate the surface areas based on the split position and plug in estimates for \(K_T\) and \(K_I\). On each pass we evaluate \(C_V(p)\) and retain the best cost and split position, \(p\). Once we have processed all potential split positions, we compare the best cost with \(C_{NS}\) and split the node if \(C_V(p) \lt C_{NS}\).
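A sketch of the sweep over one axis's sorted event list might look like the following. Names like `sweepCounts` are illustrative, not from the project; the deferred decrement of \(T_R\) implements the "on the following pass" rule.

```cpp
#include <algorithm>
#include <vector>

enum EventType { PRIMITIVE_END = 0, PRIMITIVE_START = 1 };

struct Event { double position; EventType type; };

struct SplitCount { double position; int tLeft, tRight; };

// Sweep the sorted event list for one axis, producing T_L and T_R at each
// candidate split. A triangle whose vertex lies on the split plane is counted
// in both children, hence the deferred decrement of T_R.
std::vector<SplitCount> sweepCounts(std::vector<Event> events, int totalTriangles) {
    std::sort(events.begin(), events.end(), [](const Event& a, const Event& b) {
        return a.position < b.position;
    });
    std::vector<SplitCount> counts;
    int tLeft = 0, tRight = totalTriangles, pendingEnds = 0;
    for (const Event& e : events) {
        tRight -= pendingEnds;   // decrement deferred from the previous pass
        pendingEnds = 0;
        if (e.type == PRIMITIVE_START) tLeft++;
        else pendingEnds++;      // defer: this triangle still straddles the plane
        counts.push_back({e.position, tLeft, tRight});
    }
    return counts;
}
```

For two triangles spanning \([0,1]\) and \([0.5,2]\) on an axis, the first split position yields \(T_L=1, T_R=2\), and only the last yields \(T_R=1\).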

The project download generates the kd tree recursively host-side, transfers it to a structure of arrays, and passes it to the device. The `buildKdTree()` function in the download accepts a type parameter: `KD_EVEN` splits each node at its center, resulting in an even binary space partition; `KD_MEDIAN` splits each node at the object median; and `KD_SAH` splits the node using the surface area heuristic. Below are three visualizations of the tree structure, one for each type.

**stack-based traversal**

To implement a stack-based traversal of the kd tree, we first created the stack object below. A `__stack_element` contains an `id` referencing a node, along with the \(t_{min}\) and \(t_{max}\) values for a ray, \(\vec{r} = \vec{o} + \hat{d}t\), passing through that node.

```cpp
struct __stack_element {
    int id;
    double tmin, tmax;
};

class __stack {
public:
    __stack_element stack[32];
    int count;

    __device__ __stack();
    __device__ void push(int id, double tmin, double tmax);
    __device__ __stack_element pop();
    __device__ bool empty();
};

__device__ __stack::__stack() : count(0) {}

__device__ void __stack::push(int id, double tmin, double tmax)
{
    this->stack[count].id = id;
    this->stack[count].tmin = tmin;
    this->stack[count].tmax = tmax;
    count++;
}

__device__ __stack_element __stack::pop()
{
    count--;
    __stack_element se;
    se.id = this->stack[count].id;
    se.tmin = this->stack[count].tmin;
    se.tmax = this->stack[count].tmax;
    return se;
}

__device__ bool __stack::empty()
{
    return this->count == 0;
}
```

With the stack object, it was fairly straightforward to implement the algorithm below. See [3] and [4] for details. The algorithm descends through the tree, pushing the farther nodes onto the stack. With the nearest nodes evaluated first, we can break early upon finding an intersection within the bounds.

```
intersection = none;
if (ray intersects root node) {
    stack.push(root node, tmin, tmax);
    while (!stack.empty() && !intersection) {
        (node, tmin, tmax) = stack.pop();
        while (!node.isLeaf()) {
            tsplit = (node.split - ray.origin[node.axis]) / ray.direction[node.axis];
            if (node.split - ray.origin[node.axis] >= 0) {
                first = node.left;
                second = node.right;
            } else {
                first = node.right;
                second = node.left;
            }
            if (tsplit >= tmax || tsplit < 0)
                node = first;
            else if (tsplit <= tmin)
                node = second;
            else {
                stack.push(second, tsplit, tmax);
                node = first;
                tmax = tsplit;
            }
        }
        foreach (triangle in node)
            if (ray intersects triangle)
                intersection = nearest intersection;
        if (nearest intersection > tmax)
            intersection = none;
    }
}
```

Download the project and have a look at the code. Let me know if you have any thoughts.

Download this project: path_tracer.tar.bz2

References:

1. Akenine-Möller, Tomas. Fast 3D Triangle-Box Overlap Testing. *In ACM SIGGRAPH 2005 Courses*, ACM. Los Angeles, California. 2005.

2. Wald, Ingo, and Havran, Vlastimil. On Building Fast kd-Trees for Ray Tracing, and on Doing That in O(N log N). *In Proceedings of the 2006 IEEE Symposium on Interactive Ray Tracing*. 2006.

3. Wald, Ingo. 2004. Realtime Ray Tracing and Interactive Global Illumination. PhD thesis, Saarland University.

4. Horn, Daniel Reiter, Sugerman, Jeremy, Houston, Mike, and Hanrahan, Pat. 2007. Interactive k-d tree GPU raytracing. *In Proceedings of the 2007 symposium on Interactive 3D graphics and games*, ACM. Seattle, Washington.

5. Havran, Vlastimil. 2000. Heuristic Ray Shooting Algorithms. Ph.D. Thesis, Czech Technical University in Prague.

**Thin lens**

We first reworked the camera model using the thin lens equation. Below, \(f\) is the focal length, \(d\) is the distance to the focal plane, and \(i\) is the distance to the image plane.

\begin{align}

\frac{1}{f} &= \frac{1}{d} + \frac{1}{i} \\

i &= \frac{1}{\frac{1}{f} - \frac{1}{d}} = \frac{fd}{d-f} \\

\end{align}

For a 50mm lens focused at 10m, the image plane is located at approximately 50.25mm. For a lens set to f/8, this yields a radius, \(r\), of the entrance pupil of,

\begin{align}

r &= \frac{1}{2} \cdot \frac{f}{8} \\

&= \frac{1}{2} \cdot \frac{50\,\text{mm}}{8} = 3.125\,\text{mm} \\

\end{align}

In the code we specify the focal length, aperture, and distance to the focal plane. From these we evaluate the distance to the image plane and the aperture size. The kernel simulates a 36mm-wide sensor with a height computed from the aspect ratio. We fire rays from a location on the sensor through the origin to a point, \(\vec{p}\), on the focal plane. We then jitter the origin within the disc defined by the aperture radius. If the new offset is \(\vec{o}\), the ray direction is \(\vec{r}=\vec{p}-\vec{o}\), and we sample the ray \(\vec{o} + t\vec{r}\).
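As a quick sanity check of the numbers, the two formulas can be evaluated directly (all units in millimeters; function names are illustrative):

```cpp
// Thin lens: distance to the image plane from focal length f and
// focus distance d, i = f*d / (d - f).
double imagePlaneDistance(double f, double d) { return f * d / (d - f); }

// Entrance pupil radius from focal length and f-number N: r = (f / N) / 2.
double apertureRadius(double f, double N) { return 0.5 * f / N; }
```

For `imagePlaneDistance(50, 10000)` this gives roughly 50.25mm, and `apertureRadius(50, 8)` gives 3.125mm, matching the worked example above.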

**Fresnel reflection**

Next, we added support for Fresnel reflection. This was a straightforward modification to our refractive material. We simply find the reflection coefficient, \(R\), for unpolarized light given below,

\begin{align}

R &= \frac{R_s+R_p}{2} \\

R_s &= \left( \frac{-n_1 \hat{r} \cdot \hat {n} - n_2 \sqrt{1 - \frac{n_1^2}{n_2^2} \left[1 - (\hat{n} \cdot \hat{r})^2 \right]}}{-n_1 \hat{r} \cdot \hat {n} + n_2 \sqrt{1 - \frac{n_1^2}{n_2^2} \left[1 - (\hat{n} \cdot \hat{r})^2 \right]}} \right)^2\\

R_p &= \left( \frac{n_1 \sqrt{1 - \frac{n_1^2}{n_2^2} \left[1 - (\hat{n} \cdot \hat{r})^2 \right]} - -n_2 \hat{r} \cdot \hat {n}}{n_1 \sqrt{1 - \frac{n_1^2}{n_2^2} \left[1 - (\hat{n} \cdot \hat{r})^2 \right]} + -n_2 \hat{r} \cdot \hat {n}} \right)^2\\

\end{align}

Note that all vectors above are unit vectors. We next generate a uniform random variable on the interval \([0,1]\) and reflect the ray if this number is less than \(R\). We refract and transmit otherwise.
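A direct host-side translation of these equations (with \(\cos\theta_i = -\hat{n}\cdot\hat{r}\)) might look like the sketch below; returning 1 on a negative radicand folds total internal reflection into the same function. The function name is illustrative, not the project's.

```cpp
#include <cmath>

// Unpolarized Fresnel reflection coefficient. cosI = -(n.r) is the cosine of
// the incidence angle; n1, n2 are the refractive indices. Returns 1.0 on
// total internal reflection (radicand negative), i.e. always reflect.
double fresnelReflectance(double n1, double n2, double cosI) {
    double radicand = 1.0 - (n1 * n1) / (n2 * n2) * (1.0 - cosI * cosI);
    if (radicand < 0.0) return 1.0;  // total internal reflection
    double cosT = std::sqrt(radicand);
    double rs = (n1 * cosI - n2 * cosT) / (n1 * cosI + n2 * cosT);
    double rp = (n1 * cosT - n2 * cosI) / (n1 * cosT + n2 * cosI);
    return 0.5 * (rs * rs + rp * rp);
}
```

At normal incidence from air into glass (\(n_1=1, n_2=1.5\)) this reproduces the familiar 4% reflectance.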

**Smooth shading**

In this post we discussed triangle intersections, so we have \(s\) and \(t\) for our point of intersection, \(\vec{p}\),

\begin{align}

\vec{p} &= \vec{p}_0 + s(\vec{p}_1 - \vec{p}_0) + t(\vec{p}_2 - \vec{p}_0) \\

\end{align}

Provided we have a normal for each vertex, we can exploit the \(s\) and \(t\) evaluations and use them for interpolating the normals,

\begin{align}

\vec{n} &= \vec{n}_0 + s(\vec{n}_1 - \vec{n}_0) + t(\vec{n}_2 - \vec{n}_0) \\

\hat{n} &= \frac{\vec{n}}{\left|\left|\vec{n}\right|\right|}

\end{align}
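A minimal sketch of this interpolation, using a standalone `Vec3` rather than the project's `__vector`:

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

// Interpolate per-vertex normals with the barycentric (s, t) of the hit point,
// then renormalize; mirrors the position interpolation above.
Vec3 smoothNormal(Vec3 n0, Vec3 n1, Vec3 n2, double s, double t) {
    Vec3 n = { n0.x + s * (n1.x - n0.x) + t * (n2.x - n0.x),
               n0.y + s * (n1.y - n0.y) + t * (n2.y - n0.y),
               n0.z + s * (n1.z - n0.z) + t * (n2.z - n0.z) };
    double len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    return { n.x / len, n.y / len, n.z / len };
}
```

At \(s=t=0\) the result is \(\hat{n}_0\); at \(s=1, t=0\) it is \(\hat{n}_1\), as expected.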

**Texture mapping the plane primitive**

The last addition this time around was to add texture mapping support for the plane primitive. The general idea was to define two linearly-independent vectors that span the plane. With those two vectors and a point on the plane, we can find the \(s\) and \(t\) coordinates for our point of intersection as we do for the triangle primitive. Since the texture is repeating we find the coordinates, \(s'\) and \(t'\),

\begin{align}

s' &= s - \lfloor s \rfloor \\

t' &= t - \lfloor t \rfloor \\

\end{align}

We now have appropriate texture coordinates, \(s'\) and \(t'\), that both belong to the interval \([0,1]\). These are used as offsets into our texture.
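The wrap is a one-liner per coordinate; note that \(s - \lfloor s \rfloor\) lands in \([0,1)\) even for negative \(s\). The names below are illustrative:

```cpp
#include <cmath>

struct TexCoord { double s, t; };

// Wrap plane-space coordinates into [0, 1) for a repeating texture:
// s' = s - floor(s), t' = t - floor(t). Correct for negative inputs too,
// since floor rounds toward negative infinity.
TexCoord wrapTexCoords(double s, double t) {
    return { s - std::floor(s), t - std::floor(t) };
}
```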

Download this project: pathtracer_dof_triangles_fresnel_texture_smooth.tar.bz2

We will continue with the project we left off with in this post. We will add triangles to our list of primitives. Once we are able to render triangles, this opens the door to rendering full-scale models. However, because models can contain many thousands of triangles, we need to organize those primitives effectively for intersection tests. For this we have implemented a rudimentary binary space partitioning. We will discuss towards the end what could be done to improve efficiency. Below are two renders.

In the post, A calibration method based on barycentric coordinates for multi-touch systems, we discussed barycentric coordinates. That concept will be used here for our triangle intersection tests. Our first job is to locate the point, \(\vec{p}\), where the ray intersects the plane in which the triangle lies (this was discussed a bit here). If a triangle is defined by the vertices, \(\vec{p}_0\), \(\vec{p}_1\), and \(\vec{p}_2\), the triangle normal can be given as \(\vec{n} = (\vec{p}_1-\vec{p}_0)\times(\vec{p}_2-\vec{p}_0)\). Once we have found the point, \(\vec{p}\), we evaluate the barycentric coordinates, \(s\) and \(t\), of the point relative to the triangle. These equations are given below, where \(\vec{v}_0 = \vec{p}_1-\vec{p}_0\) and \(\vec{v}_1 = \vec{p}_2 - \vec{p}_0\).

\begin{align}

s &= \frac{[(\vec p - \vec p_0) \cdot \vec v_0](\vec v_1 \cdot \vec v_1)-[(\vec p - \vec p_0) \cdot \vec v_1](\vec v_0 \cdot \vec v_1)}{(\vec v_0 \cdot \vec v_0)(\vec v_1 \cdot \vec v_1)-(\vec v_0 \cdot \vec v_1)^2}\\

t &= \frac{[(\vec p - \vec p_0) \cdot \vec v_1](\vec v_0 \cdot \vec v_0)-[(\vec p - \vec p_0) \cdot \vec v_0](\vec v_0 \cdot \vec v_1)}{(\vec v_0 \cdot \vec v_0)(\vec v_1 \cdot \vec v_1)-(\vec v_0 \cdot \vec v_1)^2}

\end{align}

Provided \(s\geq0\), \(t\geq0\), and \(s+t\leq1\), we can conclude that the point, \(\vec{p}\), lies inside the triangle. We can then reflect or transmit the ray appropriately depending on the material type.

This addendum to the path tracer project was relatively straightforward, but it does not scale well. For each ray we must find the nearest intersection, and for \(n\) primitives this amounts to \(n\) intersection tests on each ray bounce. We cannot afford to check each primitive in models containing thousands of primitives, so we have added a basic binary space partitioning.

The partitioning tree is generated host-side and transferred to the device. For this we have elected to represent our tree structure as a structure of arrays. Below is the structure as it stands. `depth` holds the depth of a specific node; `minx`, `miny`, ... `maxz` hold the bounds of the node; `child0` and `child1` hold the array indices of the two child nodes; `parent` holds the index of the parent node; `id` is the index of the node; and `leaf_id` is a separate indexing that applies only to the leaf nodes. The `leaf_id` gives us an offset into the `objects` array which, itself, applies only to the leaf nodes. `n_objects` applies to all nodes and holds the number of objects that pass through a node. Lastly, `max_depth` holds the depth of our tree, `size` is the number of nodes, and `leaf_size` is the number of leaf nodes.

```cpp
struct _bounding_box {
    unsigned short *depth, *depth_device;
    double *minx, *miny, *minz, *maxx, *maxy, *maxz,
           *minx_device, *miny_device, *minz_device,
           *maxx_device, *maxy_device, *maxz_device;
    short *child0, *child1, *child0_device, *child1_device;
    short *parent, *parent_device;
    short *id, *id_device;
    short *leaf_id, *leaf_id_device;
    unsigned short *n_objects, *n_objects_device;
    unsigned short *objects, *objects_device;
    unsigned short max_depth;
    unsigned short size, leaf_size;
};
```

If we have a tree with depth 3, then \(\text{size} = 2^{(3+1)}-1 = 15\) and \(\text{leaf\_size} = 2^{3} = 8\). Thus, we would have \(15\) nodes in total and \(8\) leaf nodes.

The idea was to first evaluate (after the camera transformation) the minimum and maximum axes values of the axis-aligned box that bounds every primitive in our scene. These values are passed to our tree-building function, and the tree is generated by splitting along the major axis. If the dimensions of our root node are \((1,2,3)\), we would first split along the \(z\)-axis resulting in two children of size \((1,2,1.5)\). The second splits would occur along the \(y\)-axis resulting in 4 nodes of size \((1,1,1.5)\).

Once we reach a leaf node, we cycle through all of the primitives in our scene seeking those primitives that pass through the leaf node. Once the tree is built and all leaf nodes have been processed, we propagate back the number of objects in each child node to its parent. In the code we have also merged the objects from child nodes to the current node if the number of objects is below a certain threshold. There would be no sense in testing 16 child nodes if they all contain the same primitive.

When testing child nodes for the containment of primitives, we have cheated a bit. For one we have not added any plane primitives to the partitioning. We simply add these primitives to the list of objects we test for intersections. For the sphere primitive we have evaluated the radius of the bounding sphere of the given tree node and compared it with the radius of the primitive. If the distance between the sphere center and the box center is less than the sum of the radii, we include the primitive as passing through the tree node. Consequently, this will include spheres that should not necessarily belong to the node, but it will include all the spheres that should. Lastly, when testing for the containment of triangle primitives in a given tree node, we evaluate the axis-aligned bounding box of the primitive and test for overlap between the two bounding boxes. Again, this will potentially include many more primitives than it should but will capture all that is necessary.

Our ray sampling procedure has been updated to query for intersections with the bounding box. If we find the ray hits the root node, we then query the two child nodes. If the ray hits a child node, we check the children of that child node. We continue like this until we reach a leaf node. Upon reaching a leaf node, we add the objects contained in that leaf node to the list of objects to test against for intersections. Below is the function for testing whether a ray intersects an axis-aligned bounding box. There are a few cases. If the node does not contain any primitives, there is no point in testing any further (no children will contain any primitives either). Additionally, the ray could originate inside the bounding box, and, lastly, we check for intersection with the left, right, bottom, top, rear, and front box faces.

```cpp
__device__ bool rayIntersects_device(_bounding_box& b, unsigned short index, __ray r)
{
    // node contains no objects: nothing to hit here or below
    if (b.n_objects_device[index] < 1) return false;

    // ray origin inside the box
    if (r.origin.x >= b.minx_device[index] && r.origin.x <= b.maxx_device[index] &&
        r.origin.y >= b.miny_device[index] && r.origin.y <= b.maxy_device[index] &&
        r.origin.z >= b.minz_device[index] && r.origin.z <= b.maxz_device[index])
        return true;

    // intersection tests against the six faces
    if (r.origin.x < b.minx_device[index] && r.direction.x > 0) {
        // check left face intersection
        double t = (-b.minx_device[index] + r.origin.x) / -r.direction.x;
        double y = r.origin.y + t * r.direction.y;
        double z = r.origin.z + t * r.direction.z;
        if (y >= b.miny_device[index] && y <= b.maxy_device[index] &&
            z >= b.minz_device[index] && z <= b.maxz_device[index])
            return true;
    }
    if (r.origin.x > b.maxx_device[index] && r.direction.x < 0) {
        // check right face intersection
        double t = (b.maxx_device[index] - r.origin.x) / r.direction.x;
        double y = r.origin.y + t * r.direction.y;
        double z = r.origin.z + t * r.direction.z;
        if (y >= b.miny_device[index] && y <= b.maxy_device[index] &&
            z >= b.minz_device[index] && z <= b.maxz_device[index])
            return true;
    }
    if (r.origin.y < b.miny_device[index] && r.direction.y > 0) {
        // check bottom face intersection
        double t = (-b.miny_device[index] + r.origin.y) / -r.direction.y;
        double x = r.origin.x + t * r.direction.x;
        double z = r.origin.z + t * r.direction.z;
        if (x >= b.minx_device[index] && x <= b.maxx_device[index] &&
            z >= b.minz_device[index] && z <= b.maxz_device[index])
            return true;
    }
    if (r.origin.y > b.maxy_device[index] && r.direction.y < 0) {
        // check top face intersection
        double t = (b.maxy_device[index] - r.origin.y) / r.direction.y;
        double x = r.origin.x + t * r.direction.x;
        double z = r.origin.z + t * r.direction.z;
        if (x >= b.minx_device[index] && x <= b.maxx_device[index] &&
            z >= b.minz_device[index] && z <= b.maxz_device[index])
            return true;
    }
    if (r.origin.z < b.minz_device[index] && r.direction.z > 0) {
        // check rear face intersection
        double t = (-b.minz_device[index] + r.origin.z) / -r.direction.z;
        double x = r.origin.x + t * r.direction.x;
        double y = r.origin.y + t * r.direction.y;
        if (x >= b.minx_device[index] && x <= b.maxx_device[index] &&
            y >= b.miny_device[index] && y <= b.maxy_device[index])
            return true;
    }
    if (r.origin.z > b.maxz_device[index] && r.direction.z < 0) {
        // check front face intersection
        double t = (b.maxz_device[index] - r.origin.z) / r.direction.z;
        double x = r.origin.x + t * r.direction.x;
        double y = r.origin.y + t * r.direction.y;
        if (x >= b.minx_device[index] && x <= b.maxx_device[index] &&
            y >= b.miny_device[index] && y <= b.maxy_device[index])
            return true;
    }

    // no intersection
    return false;
}
```

Below is the function that adds primitives to the hit list. These are primitives we must check directly for intersections. It was an attempt to avoid recursion and is fairly crude. It starts by testing the root node and continues to add indices on a bounding box hit. When a leaf node is reached, we add only those primitives that have not already been added.

```cpp
__device__ short intersects_device(_bounding_box& b, int i, __ray r, short hit_list[])
{
    int index = 0, count = 1, indices[30000];
    indices[index] = 0;
    short hit_count = 0;
    bool found = false;
    while (index < count && index < 30000) {
        i = indices[index++];
        if (rayIntersects_device(b, i, r)) {
            if (b.depth_device[i] == b.max_depth) {
                // leaf node: add its primitives, skipping duplicates
                for (int j = 0; j < b.n_objects_device[i]; j++) {
                    short hit = b.objects_device[b.leaf_id_device[i] * 10000 + j];
                    found = false;
                    for (int l = 0; l < hit_count; l++) {
                        if (hit_list[l] == hit) { found = true; break; }
                    }
                    if (!found) hit_list[hit_count++] = hit;
                }
            } else {
                indices[count++] = b.child0_device[i];
                indices[count++] = b.child1_device[i];
            }
        }
    }
    return hit_count;
}
```

The `sampleRay` function has been updated to use the `intersects_device` method. It now loops over only those primitives that must be tested directly. Since we are handling planes directly, the project expects those planes to be added to the objects list first. `sampleRay` has a second loop for handling planes; once a primitive other than a plane is found, it breaks from the loop.

Occasionally during testing, the kernel would time out. The number of rays each kernel call handles has been reduced to help prevent this from occurring. A kernel call now handles a 2-by-2 grid of blocks sized 16 by 16; thus, at the moment the kernel handles only 1024 pixels on each pass. We send an offset in both the \(x\) and \(y\)-directions to update the entire image over successive loops.

Blender was used to export 3D models in OBJ format. The project expects triangles and normals to be present in the OBJ file. When exporting do not forget to check "Include Normals" and "Triangulate Faces".

This project is fairly crude. Below is a list of some ideas that could be implemented to improve the efficiency of the project.

- kd-tree
- improved intersection testing
- ray-triangle intersections
- containment testing for spheres in nodes
- containment testing for triangles in nodes
- shared memory
- generating tree structure on device

Have a look at the project, and let me know if you have any questions or suggestions.

Download this project: pathtracer_dof_triangles.tar.bz2

Essentially, we will define the distance to the focal plane and a blur radius. For each primary ray we find its intersection with the focal plane, \(\vec{p}\), and jitter the ray origin by an amount, \(\vec{d}\). We then define the new ray direction as \(\vec{r}=\vec{p}-\vec{d}\). Consequently, objects on the focal plane will appear in focus. Below is the addendum to the `kernel()` function.

```cpp
__vector dir = __vector(x - width / 2, -y + height / 2, 0 + width) + offset;
__ray ray = { __vector(0, 0, 0), dir.unit() };
u1 = rand_device[i * width * height * 3 + index + 1];
u2 = rand_device[i * width * height * 3 + index + 2];
r1 = 2 * M_PI * u1;
r2 = u2;
offset = __vector(cos(r1) * r2, sin(r1) * r2, 0.0) * blur_radius;
__vector p = ray.origin + dir * (focal_distance / width);
ray.origin = ray.origin + offset;
ray.direction = (p - ray.origin).unit();
```

Again, don't forget to update the `Makefile` to reference the proper locations for the `libcudart.so` and `libcurand.so` libraries.

Download the updated project: pathtracer_dof.tar.bz2

Below are two screen captures of this project in action.

This path tracer is basic, fairly crude, and inefficient. I'll provide a brief overview of the code before we delve into some of the mathematics. The host code defines an abstract base class, `cObject`, from which the `cPlane` and `cSphere` classes are derived. The base class includes the material type, color, emission color, and type (plane or sphere) properties. The `applyCamera()` virtual function is defined in the derived classes and transforms the respective object into camera space.

The objects in camera space are passed to the device where the environment is rendered. The `runPathTracer()` function in `pathtracer.cu` generates some random numbers, executes the kernel, and retrieves the current frame. This frame is rendered to a texture during program execution and saved to a PPM file upon program termination.

The kernel function runs through our buffer, and for each buffer location four rays are shot out, one in each of the four quadrants surrounding the buffer location, using cosine-weighted sampling. These four samples are averaged and added to the accumulation. The device function, `sampleRay()`, is called on each ray. A maximum loop size is defined (e.g. 5 bounces), and sampling begins for the current ray.

The ray sampler loops over the maximum number of bounces. Within this loop, we loop over our objects seeking an intersection using the equations outlined below for spheres and planes. If an intersection is found (the nearest intersection), we set the values in our emission and color arrays and bounce the ray according to the material type (diffuse, specular, or refractive). Lastly, we apply the emission and color arrays to our final sample. If our final sample is \(\vec{s}_1\) and the emission and color values are \(\vec{e}_n\) and \(\vec{c}_n\), respectively, for \(n \in \{1,2,\ldots,m\}\), where \(m\) is the bounce limit, the result would be,

\begin{align}

\vec{s}_{m} &= \vec{e}_m\\

\vec{s}_{n} &= \vec{e}_{n} + \vec{c}_{n} \circ \vec{s}_{n+1}\\

\end{align}

Below we will discuss some of the mathematics involved in the process before we mention interaction and conclude with a few notes.

**Sphere intersection**

Our path tracer will include support for spheres and planes. Below we have the equation for a sphere and a ray, followed by the evaluation of the point of intersection, \(\vec{p}\). We have a point of intersection provided the discriminant of the quadratic equation is nonnegative. Lastly, we evaluate the surface normal by subtracting the sphere center from the point of intersection. Note that when we evaluate the roots of the quadratic, we will select the lesser of the two roots (the nearest point of intersection).

\begin{align}

(\vec{p} - \vec{c}) \cdot (\vec{p} - \vec{c}) &= r^2\\

\vec{r}(t) &= \vec{o} + \vec{r}t\\

(\vec{o} + \vec{r}t - \vec{c}) \cdot (\vec{o} + \vec{r}t - \vec{c}) &= r^2\\

(\vec{r}\cdot\vec{r})t^2 + 2\left[\vec{r} \cdot (\vec{o} - \vec{c})\right] t + (\vec{o} - \vec{c}) \cdot (\vec{o} - \vec{c}) - r^2 &= 0\\

\vec{n} &= \vec{p} - \vec{c}\\

\end{align}
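A host-side sketch of this intersection, selecting the lesser positive root and falling back to the greater one when the ray origin is inside the sphere (standalone types, not the project's `__vector`):

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Solve the quadratic above for the ray o + r*t against a sphere (c, radius).
// Returns the nearest positive t, or -1.0 when there is no hit.
double intersectSphere(Vec3 o, Vec3 r, Vec3 c, double radius) {
    Vec3 oc = sub(o, c);
    double a = dot(r, r);
    double b = 2.0 * dot(r, oc);
    double k = dot(oc, oc) - radius * radius;
    double disc = b * b - 4.0 * a * k;     // discriminant
    if (disc < 0.0) return -1.0;           // ray misses the sphere
    double sq = std::sqrt(disc);
    double t = (-b - sq) / (2.0 * a);      // lesser root: nearest hit
    if (t > 0.0) return t;
    t = (-b + sq) / (2.0 * a);             // origin may be inside the sphere
    return t > 0.0 ? t : -1.0;
}
```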

**Plane intersection**

Below we have the equation for a plane followed by an evaluation of the point of intersection. Note that if the ray is parallel to the plane, we have either no intersection or an unlimited number of intersections (the line lies in the plane). Here we do not need to evaluate the normal; it is an inherent property of the plane.

\begin{align}

(\vec{p} - \vec{p}_0) \cdot \hat{n} &= 0\\

\vec{r}(t) &= \vec{o} + \vec{r}t\\

(\vec{o} + \vec{r}t - \vec{p}_0) \cdot \hat{n} &= 0\\

\end{align}
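A corresponding sketch for the plane, guarding against the parallel case (illustrative names and types):

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Intersect the ray o + r*t with the plane through p0 with normal n.
// Returns t, or -1.0 when the ray is parallel to the plane or the hit
// lies behind the origin.
double intersectPlane(Vec3 o, Vec3 r, Vec3 p0, Vec3 n) {
    double denom = dot(r, n);
    if (std::fabs(denom) < 1e-12) return -1.0;  // parallel: no unique hit
    double t = dot(sub(p0, o), n) / denom;
    return t > 0.0 ? t : -1.0;
}
```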

**Specular reflection**

The simplest of the three lighting models we will implement in this project, specular reflection gives objects a mirror-like quality. Incoming rays are reflected off the surface of an object in a direction uniquely defined by the incoming ray, \(\vec{r}\), and the unit vector normal to the surface at the point of intersection, \(\hat{n}\).

\begin{align}

\vec{t} &= 2(\hat{n}\cdot\vec{r})\hat{n} - \vec{r}\\

\end{align}

**Diffuse reflection**

To implement diffuse reflections we will use cosine-weighted sampling. More information on cosine-weighted sampling can be found here. Below \(u_1\) and \(u_2\) are uniform random variables. Ultimately, we will reorient the resultant vector based on the surface normal (we are sampling from the unit hemisphere defined by the surface normal at the point of intersection).

\begin{align}

u_1 &\sim U(0,1)\\

u_2 &\sim U(0,1)\\

r &= \sqrt{1-u_1}\\

\theta &= 2\pi u_2\\

\vec{v} &=

\begin{pmatrix}

r \cos(\theta)\\

r \sin(\theta)\\

\sqrt{u_1}\\

\end{pmatrix}

\end{align}
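This mapping can be sketched directly; the resulting direction lives in the local frame where the surface normal is \(+z\) and always has unit length (the function name is illustrative):

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

// Draw a cosine-weighted direction in the local frame where the surface
// normal is +z, using the (u1, u2) mapping above: r = sqrt(1 - u1),
// theta = 2*pi*u2, z = sqrt(u1).
Vec3 cosineSample(double u1, double u2) {
    double r = std::sqrt(1.0 - u1);
    double theta = 2.0 * M_PI * u2;
    return { r * std::cos(theta), r * std::sin(theta), std::sqrt(u1) };
}
```

Since \(r^2 + z^2 = (1-u_1) + u_1 = 1\), no renormalization is needed before reorienting the sample about the surface normal.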

**Refraction**

Refraction gives the appearance of light traveling through a barrier, such as from air to glass. Below we have the equation for the transmission vector, \(\vec{t}\), based on Snell's equations. \(n_1\) and \(n_2\) are the indices of refraction of the two media. Obviously, this equation is only valid if the quantity under the radical is nonnegative. If this quantity is negative, we use the reflection equation above. Such a situation is known as total internal reflection. In our code we will initialize \(n_1\) and \(n_2\) by evaluating the inner product of the ray with the surface normal. If this product is less than zero, we are entering the medium. We also flip the normal when exiting the medium. It should be relatively straightforward to add the Fresnel equations. Kevin Beason did so here.

\begin{align}

\vec{t} &= \frac{n_1}{n_2}\hat{r} - \left( \frac{n_1}{n_2} \hat{n}\cdot\hat{r} + \sqrt{1-\frac{n_1^2}{n_2^2} \left[1 - (\hat{n} \cdot \hat{r})^2 \right]} \right) \hat{n}\\

\end{align}
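A sketch of the transmission computation, signalling total internal reflection to the caller so it can fall back to the reflection equation (illustrative host-side code; \(\hat{r}\) and \(\hat{n}\) are unit vectors with \(\hat{n}\cdot\hat{r} < 0\) when entering):

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Refraction { bool ok; Vec3 t; };

// Transmission direction from Snell's law, matching the equation above.
// Returns ok = false on total internal reflection.
Refraction refract(Vec3 r, Vec3 n, double n1, double n2) {
    double eta = n1 / n2;
    double cosI = dot(n, r);                          // negative when entering
    double radicand = 1.0 - eta * eta * (1.0 - cosI * cosI);
    if (radicand < 0.0) return { false, {0, 0, 0} };  // total internal reflection
    double k = eta * cosI + std::sqrt(radicand);
    return { true, { eta * r.x - k * n.x,
                     eta * r.y - k * n.y,
                     eta * r.z - k * n.z } };
}
```

At normal incidence the ray passes straight through regardless of the indices, which is a handy sanity check.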

**A spice of interaction**

We have attempted to add some interaction to this project by including the keyboard handler available here. The premise behind this procedure is to reset the accumulated path values when the camera position or orientation changes. The path tracer begins to progressively refine the scene when the view remains static. Improvements to the project's efficiency would yield a better interactive experience.

**Some notes**

The larger the surface area of your light sources, the faster your scene will appear to converge (less noise), because the rays hit a light source with greater probability. The project currently has a limit of 10 bounces. If you wish to exceed this limit, you must update the `sampleRay()` function in `pathtracer.cu`. Additionally, you will need to update the `Makefile` to reference the proper locations for the `libcudart.so` and `libcurand.so` libraries.

If you have any suggestions for improving this path tracer or questions about it, let me know.

Download this project: pathtracer.tar.bz2
