Ray Tracing

[Image: box]

In Chapter 8 of Computational Geometry: Algorithms and Applications, we are presented with the problem of rendering a 3-dimensional scene, consisting of several objects, a light source, and a viewpoint, onto the screen. For example, to render a pyramid standing behind a cube, one needs to determine which parts of each object are visible from the viewpoint, as well as how the light source reflects off each object, so that the image appears tangible and realistic. A typical screen might consist of 1280 x 1024 pixels. Thus, to render the scene accurately and realistically, one needs the following information for each pixel:

1. how much of each object is visible within the pixel

2. the intensity of the light each object emits toward the viewpoint

Here we will concentrate on the first issue: determining the visibility of an object for each pixel.

As an example, let's imagine a scene consisting of a red pyramid partially hidden behind a blue cube, both surrounded completely by a white background. We'll call this image the Pyr-Cube. How would we render this scene onto the screen? Remember that the screen consists of pixels, which are very small, discrete squares; a square region of the image might occupy, say, 200 x 200 of them. We can first attack the Pyr-Cube problem with a "hit or miss" strategy, asking: in which pixels is a given object visible, and in which is it not? We shoot a ray from the viewpoint through the center of each pixel; the first object hit by the ray is the one visible in that pixel. This is a simple version of the method many computer animators and graphic artists use, called ray tracing.
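
As a concrete sketch of this idea, here is a minimal ray caster in Python. To keep the intersection test short it uses spheres rather than the cube and pyramid of the Pyr-Cube scene (a real renderer would just swap in box and triangle intersection routines); all names and numbers here are illustrative, not from the book.

```python
import math
from dataclasses import dataclass

@dataclass
class Sphere:
    center: tuple   # (x, y, z)
    radius: float
    color: str

def hit_distance(sphere, origin, direction):
    """Distance along the ray origin + t * direction to the sphere, or None on a miss."""
    oc = [origin[i] - sphere.center[i] for i in range(3)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(o * d for o, d in zip(oc, direction))
    c = sum(o * o for o in oc) - sphere.radius ** 2
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return None                        # the ray misses the sphere entirely
    t = (-b - math.sqrt(disc)) / (2.0 * a)
    return t if t > 0 else None            # ignore hits behind the viewpoint

def pixel_color(scene, px, py, width=1280, height=1024):
    """Shoot one ray through the center of pixel (px, py); the first
    object hit (smallest positive distance) is the visible one."""
    x = 2.0 * (px + 0.5) / width - 1.0     # pixel center on an image plane at z = 1
    y = 1.0 - 2.0 * (py + 0.5) / height
    origin, direction = (0.0, 0.0, 0.0), (x, y, 1.0)
    hits = [(t, s.color) for s in scene
            if (t := hit_distance(s, origin, direction)) is not None]
    return min(hits)[1] if hits else "white"   # white background on a miss

scene = [Sphere((0.0, 0.0, 5.0), 1.0, "blue"),    # nearer object...
         Sphere((0.5, 0.0, 8.0), 1.5, "red")]     # ...partially hiding this one
print(pixel_color(scene, 640, 512))               # center pixel: "blue"
```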

However, this leads us to another problem. Since each pixel is in fact a small square rather than a point, a single ray through its center gives an all-or-nothing answer. If an object covers part of the pixel but not its center, the whole pixel is rendered as background, even though part of the object is genuinely visible there. Conversely, if the object covers the center but not the whole pixel, the pixel is rendered as if completely filled, which is also inaccurate. Such errors produce "jaggies" throughout the image: the edge of the cube we would like to render comes out as a jagged staircase instead of the straight line we would prefer.
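
To see the all-or-nothing error concretely, suppose the object's image covers the bottom of a unit-square pixel (the `covered` predicate below is a made-up stand-in for a real ray-object visibility test):

```python
# The object covers the band y <= 0.3, i.e. 30% of the unit-square pixel,
# but the single ray through the pixel center (0.5, 0.5) misses it:
covered = lambda x, y: y <= 0.3
print(covered(0.5, 0.5))   # False: the whole pixel is drawn as background

# Widen the band to y <= 0.6 (60% coverage) and the center ray hits,
# so the whole pixel is drawn in the object's color. Wrong both times.
covered = lambda x, y: y <= 0.6
print(covered(0.5, 0.5))   # True: the whole pixel is drawn as the object
```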

Instead of just two "hit or miss" categories, an alternative is to record a percentage hit: an object that covers 49% of the pixel contributes 0.49 times its intensity. We can estimate this percentage by shooting more than one ray through the pixel; if we shoot 100 rays and 35 of them hit the object, it covers roughly 35% of the pixel. This is called supersampling. One way to implement it is to divide the pixel into a 10 x 10 grid of smaller squares and shoot one ray through the center of each square.
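
A sketch of that 10 x 10 variant, again with a hypothetical `covered` predicate standing in for the ray-object intersection test:

```python
def coverage(covered, n=10):
    """Estimate the fraction of a unit-square pixel covered by an object
    by shooting one ray through the center of each cell of an n x n subgrid.
    `covered(x, y)` reports whether the object is visible at point (x, y)."""
    hits = sum(covered((i + 0.5) / n, (j + 0.5) / n)
               for i in range(n) for j in range(n))
    return hits / (n * n)

# Example: a half-plane covering the lower-left half of the pixel.
print(coverage(lambda x, y: x + y <= 1.0))   # 0.55, close to the true area 0.5
```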

But this leads us to yet another problem!

Even though the error in each pixel would now be small, the errors would show a certain regularity across rows and columns of pixels, and the human visual system is quite sensitive to such regular patterns. It is better to distribute the sample points randomly within each pixel, to break up this regularity in the error. But then we must decide whether a given set of random sample points is acceptable: the number of hits it produces should be representative of the covered area. For example, if most of the rays were clustered in the lower left-hand corner of a pixel, the number of hits could be zero even though an object is visible in the upper right-hand corner of the pixel. Such a set of random points would not be "good", so we need a way to decide whether a set of random sample points is good before we use it.
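
The simplest reading of "random" is independent uniform points; a common refinement (my suggestion here, not stated in the text) is jittered sampling, which keeps the 10 x 10 grid but places one random point inside each cell so no region of the pixel goes unsampled:

```python
import random

def random_samples(n=100):
    """n independent uniformly random points in the unit-square pixel."""
    return [(random.random(), random.random()) for _ in range(n)]

def jittered_samples(n=10):
    """One random point per cell of an n x n grid ('jittered' sampling):
    random enough to break up regular error patterns, stratified enough
    that no corner of the pixel is left unsampled."""
    return [((i + random.random()) / n, (j + random.random()) / n)
            for i in range(n) for j in range(n)]
```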

But how do we do that???

To call a set good, we want the difference between the percentage of hits for an object and the percentage of the pixel area where that object is visible to be small; this difference is called the discrepancy of the sample set with respect to the object. However, since we don't know in advance which objects will be visible in the pixel, we should prepare for the worst case: we want the maximum discrepancy, over all possible ways an object can be visible inside the pixel, to be small. This maximum is called the discrepancy of the sample set, and which shapes we must consider depends on the type of objects in the scene. For a polyhedral scene like the Pyr-Cube, an object's edge crosses the pixel as a straight line, so the natural shapes to consider are half-planes.
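
Here is a rough sketch of the discrepancy with respect to a single half-plane. The function name and parameters are mine; and where the book computes the covered area exactly (by clipping the half-plane against the square), this sketch just estimates it on a fine grid:

```python
import random

def halfplane_discrepancy(samples, a, b, c, grid=200):
    """|hit fraction - covered area| for the half-plane a*x + b*y <= c,
    restricted to the unit-square pixel. The area term is approximated
    with a dense grid of test points."""
    def inside(x, y):
        return a * x + b * y <= c
    hit_fraction = sum(inside(x, y) for x, y in samples) / len(samples)
    area = sum(inside((i + 0.5) / grid, (j + 0.5) / grid)
               for i in range(grid) for j in range(grid)) / (grid * grid)
    return abs(hit_fraction - area)

# A bad set: all 100 samples crammed into the lower-left corner.
bad = [(0.1 * random.random(), 0.1 * random.random()) for _ in range(100)]
# The half-plane x + y >= 1 covers half the pixel yet contains no sample:
print(halfplane_discrepancy(bad, -1.0, -1.0, -1.0))   # about 0.5, i.e. terrible
```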

So if the discrepancy of a sample set is low, we keep it; if not, we generate a new random set and test again. For this we need an algorithm that computes the discrepancy of a given point set.
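
A brute-force version of that generate-and-test loop, self-contained for the sketch. The true discrepancy maximizes over all half-planes (the book develops an exact algorithm for this using arrangements and duality); here we merely probe a few hundred random half-planes, so the measured value is only a lower bound, and the threshold below is chosen loosely just for the demo:

```python
import random

def approx_discrepancy(samples, trials=500, grid=100):
    """Lower bound on the half-plane discrepancy of a sample set: the worst
    |hit fraction - covered area| over `trials` random half-planes."""
    cells = [((i + 0.5) / grid, (j + 0.5) / grid)
             for i in range(grid) for j in range(grid)]
    worst = 0.0
    for _ in range(trials):
        a, b = random.uniform(-1, 1), random.uniform(-1, 1)
        c = random.uniform(-2, 2)
        def inside(p):
            return a * p[0] + b * p[1] <= c
        hit = sum(map(inside, samples)) / len(samples)
        area = sum(map(inside, cells)) / len(cells)
        worst = max(worst, abs(hit - area))
    return worst

def good_sample_set(n=100, threshold=0.15, max_tries=20):
    """Generate-and-test: draw random sample sets until one's
    (approximately measured) discrepancy falls below the threshold."""
    for _ in range(max_tries):
        samples = [(random.random(), random.random()) for _ in range(n)]
        if approx_discrepancy(samples) <= threshold:
            return samples
    raise RuntimeError("no acceptable set found; loosen the threshold")

print(len(good_sample_set()))   # 100 points with low measured discrepancy
```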

