
What is ray tracing?

Arqade Asked on December 13, 2021

I’m not a big tech person, so the first time I heard about it was in Cyberpunk 2077 discussions. Apparently only the really high-end GPUs can support it properly.
What exactly is ray tracing? How will it change the graphics of a game? Since my GPU can’t support it, I’m wondering if I’ll miss out on much.

5 Answers

The answer by SF is a very good answer, and deserves to be accepted. There are a few questions it raises that I can answer, as I studied cutting-edge graphics for fun back in the 90s. I'd do this in a comment if I had the reputation to do so (I signed up for this purpose). Since I can't, I'll go into a bit more depth on 3D graphics in general to better illustrate why things are different when they seem the same on the surface.

A couple terms first:

I'm going to use the term shading to refer to non-raytraced graphics. I'll explain why shortly.

Interpolating is taking two values some distance apart and smoothly filling in the gaps between them. If the difference is 10 across 10 steps, you count by 1s, but if the difference is 20 over 10 steps you would count by 2s, and so on.

Rasterizing is the act of taking a picture and representing it on a grid. All graphics displayed on modern monitors are raster graphics. The reason we qualify with the name "raster" is because some old displays didn't use a grid, they drew lines. Those were vector graphics displays and, rather than pixels, they drew line segments and curves as the basic unit of rendering.

A scene is a collection of things to draw in a 3D world.

Inside your scene are objects, which are made up of surfaces. Each surface is a flat polygon.

The camera is the perspective that determines what is drawn. Often people like to think of a camera as a single point, but it isn't; it is the same shape as your display. (In practice, the camera is treated as a frustum, which is a truncated rectangular-based pyramid. Everything inside it gets drawn--this is a shortcut for practical purposes but the ideal would be infinite depth.)

If you divide the camera into a grid the same size as the resolution you are rendering, each cell is a single pixel. To render a scene, you need to determine what color to make each pixel. The difference between shading and ray tracing is in how we determine that.

As long as this post is, it would be many times larger if I went into all the detailed math and history of optimizations. I'm going to cut some corners, so please, if you want to use this knowledge for research or to win Internet arguments, please do thorough research first, as it is my goal to be right enough to convey the idea without getting bogged down in details.

Early ray tracing was based on the idea that you trace a ray from the focal point through each pixel. You then draw whatever that pixel hits. This has a problem, though, and that is that the color at that point depends on the light hitting it. For a simple model, with a single global light source, this is easy, but light bounces off all the surfaces. Objects cast shadows, some surfaces reflect light, and to figure out what color each pixel is, you actually have to figure out where each of the corners hit and make a new camera there. You trace more rays through the new camera and average the colors together to determine what color your pixel should be. But each of those rays has the same problem, and we have to keep nesting this forever. In practice by reducing the resolution of each successive surface we eventually reach an end, but it is still complicated and very processor intensive. Not viable for video games. So shortcuts were created for faster graphics. These shortcuts started with "good enough" and just got better over time. The shortcuts were so much more usable that ray tracing fell out of fashion almost completely for several years.

Every ray we trace requires us to test each surface to see where it hits. That is a lot of comparisons when you consider a 4k screen has roughly 8 million pixels and a normal scene has tens of thousands of surfaces. Instead, we can work backwards--we can draw each surface once and figure out which ray would hit it. A little bit of math makes that computation trivial. With the different shading methods we look at each surface and draw it on the screen. Each surface is a polygon, and we can figure out where on the screen each vertex of that polygon is located. Interpolating between the locations of the vertices we can find all the pixels that correspond to that surface. For each pixel we can determine where a ray through that pixel would strike the surface.
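To make the "little bit of math" concrete, here is a rough sketch (mine, not from the original answer) of projecting a single vertex to a pixel position. It assumes a pinhole camera at the origin looking down the +z axis with a roughly 90-degree field of view; real engines do this with matrices and also handle clipping, but the core is just a divide by depth.

    # Hedged sketch: map a 3D vertex to a pixel on a width x height screen.
    # Assumes a camera at the origin looking down +z; z must be positive.
    def project(vertex, width, height):
        x, y, z = vertex
        ndc_x = x / z                               # perspective divide
        ndc_y = y / z
        px = (ndc_x + 1.0) * 0.5 * width            # map [-1, 1] to [0, width]
        py = (1.0 - (ndc_y + 1.0) * 0.5) * height   # flip y so +y points up
        return px, py

    print(project((0.5, 0.5, 2.0), 640, 480))       # -> (400.0, 180.0)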

That point on the surface has a color. This can be because the entire surface is a single color, because each vertex has a color and you figured out how far from each vertex the point is and computed a weighted average, or because the surface is textured and you looked up texture coordinates in an image to pick a color. This color, however it is determined, is the diffuse value, which can be thought of as "the color a thing is." The next most important information to determine what color to make the pixel is the amount of light that is shining on that surface. Modern shaders get really complicated in this part, adding more and more parts to determine various parameters, but the basic idea is the same: you have figured out what each pixel is looking at, then you determine its color.
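As an illustrative sketch (my own, with made-up values), the "weighted average" of vertex colors can be computed with barycentric coordinates, which measure how far the point is from each vertex of the triangle:

    # Hedged sketch: average the three vertex colors of a 2D triangle at a
    # point inside it, weighting each color by its barycentric coordinate.
    def barycentric_weights(p, a, b, c):
        (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
        area = (bx - ax) * (cy - ay) - (cx - ax) * (by - ay)
        w_a = ((bx - px) * (cy - py) - (cx - px) * (by - py)) / area
        w_b = ((cx - px) * (ay - py) - (ax - px) * (cy - py)) / area
        return w_a, w_b, 1.0 - w_a - w_b

    def interpolate_color(p, verts, colors):
        weights = barycentric_weights(p, *verts)
        return tuple(round(sum(w * col[i] for w, col in zip(weights, colors)))
                     for i in range(3))

    verts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]        # red, green, blue corners
    print(interpolate_color((1 / 3, 1 / 3), verts, colors))  # centroid -> (85, 85, 85)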

In addition to the diffuse value, we need to know how much light is hitting the surface. To figure out that we need to know what way the surface is facing. We call this the normal vector and each shading model uses a different method to identify normal vectors and turn them into lighting values.

Flat shading has a single normal vector for each surface. We use the angle between the light source and the surface normal to determine the amount of light to apply. This means that every light hits every point on the surface equally, so the entire surface has a single uniform brightness. It doesn't look very good, but it is fast to compute.
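A minimal sketch of that calculation (mine, not the answer's): the "angle between the light source and the surface normal" boils down to a dot product of two unit vectors, which is the cosine of that angle.

    import math

    # Hedged sketch of flat (Lambertian) shading: one normal per surface, so
    # one brightness value for the whole surface.
    def normalize(v):
        length = math.sqrt(sum(c * c for c in v))
        return tuple(c / length for c in v)

    def flat_brightness(surface_normal, light_direction):
        n = normalize(surface_normal)
        l = normalize(light_direction)   # direction from the surface toward the light
        cosine = sum(a * b for a, b in zip(n, l))
        return max(0.0, cosine)          # clamp: surfaces facing away get no light

    print(flat_brightness((0, 1, 0), (0, 1, 0)))   # light directly above -> 1.0
    print(flat_brightness((0, 1, 0), (1, 1, 0)))   # light at 45 degrees  -> ~0.707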

Gouraud shading uses a separate normal vector for each vertex in a surface. After computing the lighting at each vertex, you can quickly interpolate the lighting value across the surface. This was used a lot in the late 80s and early 90s and usually looks really smooth and glossy, like a polished plastic.

Phong shading computes the normal vector at each vertex just as in Gouraud shading, but instead of interpolating the colors we interpolate the normal vectors and compute the lighting for each pixel individually. An evolution of this model is called normal mapping, in which a texture stores the normal vector for each point on a surface, allowing for very high detail. Normal mapping is generally considered a special case of Phong shading, because the idea of per-pixel normals is the defining characteristic.
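A tiny sketch of the difference (my own simplification, reusing the Lambert idea above): Gouraud interpolates the lighting values computed at the vertices, while Phong interpolates the normals and lights each pixel from the interpolated normal.

    import math

    def normalize(v):
        length = math.sqrt(sum(c * c for c in v))
        return tuple(c / length for c in v)

    def brightness(normal, light_dir):
        n, l = normalize(normal), normalize(light_dir)
        return max(0.0, sum(a * b for a, b in zip(n, l)))

    def lerp(a, b, t):
        return tuple(x + (y - x) * t for x, y in zip(a, b))

    n0, n1 = (0.0, 1.0, 0.0), (1.0, 0.0, 0.0)   # normals at two vertices
    light = (0.0, 1.0, 0.0)                     # direction toward an overhead light

    # Gouraud: light each vertex, then interpolate the resulting brightness.
    gouraud_mid = 0.5 * brightness(n0, light) + 0.5 * brightness(n1, light)

    # Phong: interpolate the normal itself, then light that pixel.
    phong_mid = brightness(lerp(n0, n1, 0.5), light)

    print(gouraud_mid, phong_mid)   # 0.5 vs ~0.707 at the halfway pixel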

All of that is a crash course in the history of 3D graphics. It is, technically, wrong in a lot of areas, but it should give you an appreciation for the basic shape that things took. The important takeaway is that ray tracing started as the idealized model of how to draw 3D graphics but was too difficult to do, so shading was introduced as a shortcut that produced adequate results. We have been refining that process for several decades.

And this brings us back to modern ray tracing. I am very much not an expert on current techniques (I have a passing academic interest), so take this with a hefty dose of salt; I could be very wrong on the details here.

There are a lot of problems with the basic ray tracing algorithm that simply cannot be solved. We have to trace rays through the points between the pixels and then determine what is inside each pixel. We can't just trace rays through each point because they can hit different surfaces, possibly different objects (or possibly one never hits anything at all). We can't create a camera to determine what each pixel should see. So we use shortcuts. Shading models look at the surfaces of objects, whereas ray tracing looks at lights. For each light you can figure out how strongly it shines on each surface. The color you see is the light reflected by the surface towards the camera. But some light will shine off in other directions, and that light will illuminate other surfaces, some of which will reflect light towards the camera and some light towards still other surfaces. The important fact here is that light can only diminish as you trace it--eventually it is so dim you can ignore it. The light that bounces off each surface towards the camera shines through some number of pixels of the camera, and each pixel accumulates light values until no more lights have to be computed.
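The fact that light only diminishes is what lets the computation terminate. A deliberately over-simplified sketch of just that stopping criterion (not the actual traversal described above):

    # Hedged sketch: each bounce scales the carried energy by a reflectance
    # below 1, so at some point the contribution is too dim to matter.
    energy = 1.0
    reflectance = 0.5      # assumed: fraction of light a surface passes on
    bounces = 0
    while energy > 0.01:   # stop once the light is dim enough to ignore
        energy *= reflectance
        bounces += 1
    print(f"stopped after {bounces} bounces")   # 7 for these numbers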

From the perspective of someone playing a video game, there are two big differences:

Ray tracing allows for detailed reflections and refractions, including on complex surfaces. To have a mirror in shading you generally create a camera where the mirror is, render it to a texture, then draw that texture. This doesn't work well on complex surfaces and two overlapping mirrors are very difficult to manage. As a result, game designers tend to avoid situations where this would have to be done. There are solutions, but every solution has a different set of tradeoffs, and the simplest one is to design content that avoids the problem altogether.

Ray tracing allows multiple surfaces to be "visible" through a pixel. The basic process of drawing with shaders means each pixel represents the light bouncing off a single surface. This means that objects have sharp edges. The technical term for this is aliasing, which means we're drawing a low-quality version of a high-quality image. Many techniques have been developed over the years to combat this effect, collectively known as "anti-aliasing"; supersampling in particular has become popular in recent years. Antialiasing is an entire field of research. You can think of it as blurring the edges of objects, but that ranges from inaccurate to flat-out wrong when you get into the details. Supersampling is comparatively simple: use a camera larger than your screen, then shrink the image to fit. If you render at twice the width and twice the height, you'll blur together 4 pixels the camera rendered to make one pixel on the screen. Ray tracing avoids this by figuring out how much light from every surface travels through a pixel, so there is no aliasing in the first place.
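Here is a rough sketch (mine) of the supersampling step described above: render at double the resolution, then average each 2x2 block of rendered pixels down to one screen pixel.

    # Hedged sketch of 2x supersampling: the "camera" image is twice as wide
    # and tall as the screen, and each 2x2 block is averaged into one pixel.
    def downsample_2x(image):
        small = []
        for y in range(0, len(image), 2):
            row = []
            for x in range(0, len(image[y]), 2):
                total = (image[y][x] + image[y][x + 1] +
                         image[y + 1][x] + image[y + 1][x + 1])
                row.append(total / 4)
            small.append(row)
        return small

    # A hard black/white edge in the 4x4 render becomes a softened 2x2 image.
    rendered = [
        [0, 0, 255, 255],
        [0, 0, 255, 255],
        [0, 255, 255, 255],
        [0, 255, 255, 255],
    ]
    print(downsample_2x(rendered))   # [[0.0, 255.0], [127.5, 255.0]]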

So, with all of that said, there are differences in what is rendered, but how much of a difference will this make for playing games?

In theory, not much at all. Decades of research and development in 3D graphics have led to a large number of workarounds, hacks, and optimizations. Graphics have gotten really good without needing ray tracing. As part of this progression, development studios have toolchains designed to work with shaders and produce amazing results with them. There is a strong reason to continue supporting shaders, as they are well established both in development workflows and in consumer hardware; studios would shoot themselves in the foot by abandoning shaders completely. Over time, more and more effort is likely to be put into ray-traced assets than shaded ones, and that will follow the adoption of the hardware. Console support for ray tracing is likely to be the largest catalyst for that movement. All you're missing out on is the very highest quality graphics, but that isn't much different from using a monitor that only supports 8-bit color channels, playing at 1080p instead of 4k, at 30fps instead of 60, or with any graphics setting below maximum. If you, like most people, don't mind not having the absolute best, then you won't really be missing anything.

In practice, however, it will probably be a little more than that, but only as a novelty. When you put a new toy in the hands of artists and engineers, they will play with it. They will experiment and suss out its capabilities and learn how to work with this new material. I fully expect that there will be visual easter eggs only visible with ray tracing--perhaps a room that, when reflected on the side of a teapot, looks like Groucho Marx. These sorts of easter eggs will be primarily novelty and, once discovered, the images will be all over the Internet, so you won't really be missing out.

Answered by Steve on December 13, 2021

TLDR: Raytracing is a way of achieving highly realistic graphics. However, it's currently slower than traditional methods, though that will change in the near future since more and more graphics cards have hardware to speed up raytracing.

Before Raytracing

For many years, the preferred way of generating computer graphics in games has been rasterization. In this method, the program takes 3D data (points, polygons, etc.), transforms it into 2D space, and fills in (rasterizes) the on-screen polygons. This process is pretty performant and easy to accelerate using hardware, so it's been the method of choice for many years.

However, this method has some pitfalls; namely, it's not very good at generating realistic graphics. To achieve good-looking results using rasterization, you need to use a variety of tricks (some would call them "hacks") as well as considerable effort on the part of the artists. Some areas it struggles with are:

  • Realistic reflection and refraction
  • Global illumination/indirect illumination (areas of a scene that aren't directly lit aren't pitch black)
  • Dispersive effects (prisms)

among others.

So what is raytracing?

There is another way of generating 3D graphics, and it's called raytracing. To be accurate, raytracing is actually a family of methods, but at their core they function similarly. Instead of converting 3D primitives to 2D polygons, a raytracer shoots rays from the camera out into the scene and shades the pixel based on the intersection.
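As a rough sketch of that idea (my own, using a sphere because its intersection math is short; the answer itself doesn't include code): build a ray through a pixel, then test it against an object in the scene.

    import math

    # Hedged sketch: shoot a ray from the camera through one pixel and test it
    # against a single sphere. A real raytracer repeats this for every pixel
    # and every object, keeping the closest hit.
    def ray_through_pixel(px, py, width, height):
        x = (px + 0.5) / width * 2.0 - 1.0     # pixel -> [-1, 1]
        y = 1.0 - (py + 0.5) / height * 2.0
        length = math.sqrt(x * x + y * y + 1.0)
        return (x / length, y / length, 1.0 / length)   # unit direction, camera at origin

    def hit_sphere(origin, direction, center, radius):
        # Solve |origin + t*direction - center|^2 = radius^2 for t.
        oc = tuple(o - c for o, c in zip(origin, center))
        b = 2.0 * sum(d * o for d, o in zip(direction, oc))
        c = sum(o * o for o in oc) - radius * radius
        disc = b * b - 4.0 * c                 # the quadratic's a == 1 here
        if disc < 0:
            return None                        # the ray misses the sphere
        return (-b - math.sqrt(disc)) / 2.0    # distance to the nearest hit

    ray = ray_through_pixel(320, 240, 640, 480)          # roughly the screen center
    print(hit_sphere((0, 0, 0), ray, (0, 0, 5), 1.0))    # ~4.0 units away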

Here's an illustration that explains the process:

[Ray tracing diagram created by Henrik, originally uploaded to Wikimedia Commons.]

How the pixels are shaded will affect the output quality. One subset of raytracing, pathtracing, combines raytracing with some mathematics to generate highly realistic (in fact, photorealistic) graphics with minimal complexity. In fact, a decent programmer can create a basic pathtracer in just a weekend. I made a small pathtracer recently.

The caveat is that tracing rays is horrendously slow. In an unoptimized raytracer, for each pixel on the screen you need to test intersections against every polygon in the scene several times. However, the improved output quality means that raytracing-based methods have been preferred when it comes to offline rendering (think animated films) for a long time.

Real-time raytracing

Many newer graphics cards (think NVIDIA's RTX line) have hardware units that accelerate raytracing by performing certain RT-related calculations quickly using dedicated silicon. This means that we may see more and more games utilizing raytracing-based techniques to enhance or even create their game's visuals.

Answered by adrian on December 13, 2021

The current predominant method for rendering 3D graphics is called rasterisation. It's a relatively imprecise way of rendering 3D, but it's extremely fast compared to all other methods of 3D rendering. This speed is what enabled 3D graphics to come to consumer PCs when they did, considering the capabilities (or lack thereof) of hardware at the time.

But one of the tradeoffs of that speed is that rasterisation is pretty dumb. It doesn't have any concept of things like shadows or reflections, so a simulation of how these should behave has to be manually programmed into a rasterisation engine. And depending on how they are programmed, these simulations may fail - this is why you sometimes see artifacts like lights shining through walls in games.

Essentially, rasterisation today is a bunch of hacks, built on top of hacks, built on top of even more hacks, to make 3D scenes look realistic. Even at its best, it's never going to be perfect.

Ray-tracing takes a completely different approach by modelling how light behaves in relation to objects in a 3D environment. Essentially it creates rays of light from a source or sources, then traces each ray's path through the environment. If a ray hits any objects along the way, it may affect their appearance, be reflected, be refracted, and so on.

The upshot of ray-tracing is that it essentially models how light behaves in the real world, which results in far more realistic shadows and reflections. The downside is that it is far more computationally expensive, and therefore much slower, than rasterisation (the more rays you have, the better the scene looks, but also the more rays you have, the slower it renders). Slow enough in fact, that ray-traced graphics have been unplayable on even the fastest hardware.

Until recently, therefore, there was no reason for games engines to provide anything other than the ability to render via rasterisation. But in 2018 NVIDIA added special hardware (so-called RTX) to its Turing-series graphics cards that allows ray-tracing computation to be performed far faster than was previously possible. This has allowed games companies to start building ray-tracing capabilities into their game engines, in order to take advantage of these hardware features to generate game worlds that appear more realistic than rasterisation would allow.

Since rasterisation has been around for so long, and since mainstream adoption of ray-tracing is still in early days, you are unlikely to see much difference between Cyberpunk's rasterised vs ray-traced graphics. In the years to come though, ray tracing will become the new standard for rendering 3D graphics.

Technically, any graphics card can render ray-traced graphics, but most lack the hardware that will allow them to render those graphics at a decent frame rate.

Before anyone tears me apart for this unscientific overview of how rasterisation and ray-tracing work, please understand that my explanation is written for the layman.

Answered by Ian Kemp on December 13, 2021

Traditionally, home computer games have employed a technique called rasterization. In rasterization, objects are described as meshes, composed of polygons which are either quads (4 vertices) or tris (3 vertices). Nowadays, it's almost exclusively tris. You can attach additional information to this - what texture to use, what color to use, what the normal is, etc.

The model, view and projection matrices are three separate matrices. Model maps from an object's local coordinate space into world space, view from world space to camera space, projection from camera to screen.
If you compose all three, you can use the one result to map all the way from object space to screen space, making you able to work out what you need to pass on to the next stage of a programmable pipeline from the incoming vertex positions.
(Source: The purpose of Model View Projection Matrix)
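A small sketch of that composition (my own, with deliberately trivial matrices; a real projection matrix would replace the identity stub):

    import numpy as np

    # Hedged sketch: compose model, view and projection into one matrix and
    # apply it to a homogeneous vertex. Translations stand in for real
    # transforms just to keep the example readable.
    def translation(tx, ty, tz):
        m = np.eye(4)
        m[:3, 3] = [tx, ty, tz]
        return m

    model = translation(2.0, 0.0, 0.0)       # object space -> world space
    view = translation(0.0, 0.0, -5.0)       # world space  -> camera space
    projection = np.eye(4)                   # camera space -> clip space (stub)

    mvp = projection @ view @ model          # note the right-to-left order

    vertex = np.array([1.0, 1.0, 0.0, 1.0])  # homogeneous object-space position
    print(mvp @ vertex)                      # [ 3.  1. -5.  1.]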

This is a very simple model, however, and you need to take special care of all sorts of things. For example, you need to somehow sort the polygons first and render them back-to-front, because you simply transform polygons, and rendering a nearby polygon first and a far polygon afterwards might just overwrite the closer polygon. You have no shadows. If you want shadows, you need to render a shadow map first. You have no reflection, no refraction, and transparency is hard to get right. There is no ambient occlusion. These things are all costly tricks that are patched onto this model and are smoke and mirrors to get realistic-looking results.
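For illustration (my own sketch, not from the answer), the back-to-front sorting is essentially the classic painter's algorithm: sort polygons by depth and draw the farthest first so nearer ones overwrite them.

    # Hedged sketch of the painter's algorithm: order triangles by average
    # camera-space depth (z) and draw from farthest to nearest.
    triangles = [
        {"name": "near", "vertices": [(0, 0, 1.0), (1, 0, 1.2), (0, 1, 1.1)]},
        {"name": "far",  "vertices": [(0, 0, 9.0), (1, 0, 9.5), (0, 1, 9.2)]},
        {"name": "mid",  "vertices": [(0, 0, 4.0), (1, 0, 4.1), (0, 1, 4.2)]},
    ]

    def average_depth(triangle):
        zs = [v[2] for v in triangle["vertices"]]
        return sum(zs) / len(zs)

    for tri in sorted(triangles, key=average_depth, reverse=True):
        print("draw", tri["name"])   # far, then mid, then near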

Up until recently, this technique was the only technique fast enough to convert a 3D scene to a 2D image for display in home computer games, which need at least about 30 frames per second to not appear stuttering.

Ray tracing, on the other hand, in its original form, is extremely simple (and consequently dates back to the 16th century, and was first described for computers in 1968 by Arthur Appel). You shoot a ray through every pixel of your screen and record the closest collision of the ray with a polygon. Then you color the pixel according to whatever color you find at that polygon. This can again come from a shader, e.g. a texture, or a plain color.
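A hedged sketch of that loop (mine; the hit tests are canned stand-ins so the structure stays visible without repeating intersection math): for each ray, keep whichever surface it hits closest, or a background color if it hits nothing.

    # Hedged sketch of the basic ray-casting decision: closest hit wins.
    surfaces = [
        {"color": "red",  "hit": lambda ray: 5.0 if ray == "center" else None},
        {"color": "blue", "hit": lambda ray: 3.0 if ray == "center" else None},
    ]
    BACKGROUND = "black"

    def shade(ray):
        closest, color = float("inf"), BACKGROUND
        for surface in surfaces:
            distance = surface["hit"](ray)       # None means the ray misses
            if distance is not None and distance < closest:
                closest, color = distance, surface["color"]
        return color

    print(shade("center"))   # "blue" -- the nearer of the two hits
    print(shade("corner"))   # "black" -- this ray hits nothing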

Reflection is conceptually extremely simple. Your ray has hit a reflective surface? Well, simply shoot a new ray from the point of reflection. Since angle of incidence is the same incoming and outgoing, this is trivial.
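In vector form (a small sketch of my own), that bounce is r = d - 2(d·n)n, where d is the incoming direction and n is the unit surface normal:

    # Hedged sketch: reflect an incoming direction d about a unit normal n.
    def reflect(d, n):
        d_dot_n = sum(a * b for a, b in zip(d, n))
        return tuple(a - 2.0 * d_dot_n * b for a, b in zip(d, n))

    # A ray heading down-and-right bounces off a floor whose normal points up.
    print(reflect((0.7071, -0.7071, 0.0), (0.0, 1.0, 0.0)))
    # -> (0.7071, 0.7071, 0.0): same angle, now heading up-and-right.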

Refraction, which is incredibly hard with rasterization, is conceptually very simple with ray tracing - just emit a new ray, bent by the angle of refraction of the material, or multiple rays for scattering. A lot of physical concepts are very, very easy to describe with ray tracing.
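A small sketch of that refraction step (mine, using Snell's law in vector form; eta is the ratio of refractive indices n1/n2, and the normal is assumed to point back toward the incoming ray):

    import math

    # Hedged sketch: compute the refracted direction, or None when the ray
    # undergoes total internal reflection instead.
    def refract(d, n, eta):
        cos_i = -sum(a * b for a, b in zip(d, n))
        sin2_t = eta * eta * (1.0 - cos_i * cos_i)
        if sin2_t > 1.0:
            return None                     # total internal reflection
        cos_t = math.sqrt(1.0 - sin2_t)
        return tuple(eta * a + (eta * cos_i - cos_t) * b for a, b in zip(d, n))

    # Air (n=1.0) into glass (n=1.5): the ray bends toward the normal.
    print(refract((0.7071, -0.7071, 0.0), (0.0, 1.0, 0.0), 1.0 / 1.5))
    # -> roughly (0.471, -0.882, 0.0)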

Shadows are trivial. If your ray hits a polygon, just shoot rays from that point to every light source. If a light source is visible, the area is lit; otherwise it's dark.
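The shadow test reduces to one comparison per light (again a sketch of my own, with the intersection distances stubbed out): if anything sits between the hit point and the light, that light contributes nothing.

    # Hedged sketch: a point is lit by a light if no blocker along the shadow
    # ray is closer than the light itself.
    def lit_by(light_distance, blocker_distances):
        return all(d >= light_distance for d in blocker_distances)

    print(lit_by(10.0, []))       # True  -- nothing in the way
    print(lit_by(10.0, [4.0]))    # False -- an object casts a shadow here
    print(lit_by(10.0, [15.0]))   # True  -- the blocker is behind the light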

This conceptual simplicity comes at a cost, though, and that cost is performance. Ray tracing is a brute-force approach to simulating light rays in a physical way, and recreating the physical behavior of light, including conservation laws (especially conservation of energy), is much easier with ray tracing than with rasterization.

This means physically accurate images are much easier to achieve with ray tracing. However, this comes at a tremendous cost:
You simply shoot rays. Lots of rays. And each time light reflects, refracts, scatters, bounces or whatnot you again shoot lots of rays. This costs a tremendous amount of computing power and hasn't been within the grasp of general-purpose computing hardware in the past.

Path tracing is a technique that has revolutionized ray tracing, and most ray tracing today is actually path tracing. Instead of branching into ever more rays at every bounce, a path tracer follows a single randomly chosen continuation per sample and averages many samples per pixel (a Monte Carlo approach). Path tracing, combined with bidirectional ray tracing (introduced in 1994), where rays are also shot through the scene from the light sources, has sped up ray tracing significantly.

Nowadays, you simultaneously shoot rays (or bundles of rays) from the camera and the light sources, reducing the number of rays shot and allowing more guided tracing of paths.

Implementing a simple ray tracer with reflection, refraction, scattering and shadows is actually quite easy; it can be done over a weekend (been there, done that). Don't expect it to have any reasonable performance, though. Implementing the same from scratch as a rasterization technique (roll your own OpenGL) is much harder.


Answered by Polygnome on December 13, 2021

In normal rendering, you have light sources, solid surfaces that are being lit, and ambient light. The brightness of a surface is calculated from its distance and angle relative to the light source; a colored tint may be added by the light, the overall brightness is adjusted by the ambient (omnipresent) light level, and then maybe some other effects are added, or the effects of other light sources are calculated if they are present - but at that point the history of interaction between this light source and this surface ends.

In raytracing, the light doesn't end at lighting up a surface. It can reflect. If the surface is shiny, the reflected light can light up a different surface; or if the 'material property' says the surface is matte, the light will dissipate and act a bit like ambient light of quickly decreasing level in the vicinity. Or, if the surface is partially transparent, the beam can continue, acquiring properties of the 'material', changing color, losing intensity, becoming partially diffuse, and so on. It can even refract into a rainbow or bend through a lens. In the end, the light that ends up 'radiated away' has no impact; only what reaches the 'camera' counts.

This results in a much more realistic and often vibrant scene.


The technology has been in use for a long, long time, but it was always relegated to the CPU, and rendering a single still image using raytracing would sometimes take days. It's a new development that graphics cards have gotten good enough to perform this in real time, rendering frames of a game's animation as they happen.

Generating an image like the one attached below would have taken a couple of days in 1989. Currently, on a good graphics card, it takes less than 1/60th of a second.

Answered by SF. on December 13, 2021
