There is a lot of confusion and cargo-cult storytelling surrounding ray tracing; here I will try to explain what is actually going on.
TL;DR DXR/RTX is not really "ray tracing"
What is "ray tracing"?
The classic definition comes from a 1979 paper by Turner Whitted, "An Improved Illumination Model for Shaded Display". In a simplified form, [the global illumination information that affects the intensity of each pixel] is stored in a tree of "rays" extending from the viewer to the first surface encountered and from there to other surfaces and to the light sources.
Thus classic ray tracing is an iterative process where an unspecified number of rays are traced from the camera through the scene, creating new rays at each bounce, until they hit a light source.
Ironically, it's also called "BRT" or "backward ray tracing" in the literature: in reality light travels from the light source into the eye, and here we trace in the opposite direction.
Is it photorealistic?
No. The real light transport equation is too complex to solve exactly, even for simple cases of a few planes in space, not to mention complex objects.
But it can look pretty good for certain effects, for example reflections and shadows.
If we add correct refraction for transparent surfaces, things get pretty complicated.
We then need to cast at least 3 rays at each intersection: a reflection ray, a shadow ray and a refraction ray, and do the same at every intersection those rays produce later on.
The ray count multiplies so quickly that more than one bounce cannot be traced in a sane amount of time.
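To make that blow-up concrete, here is a minimal sketch of the Whitted-style recursion in Python; the scene, ray and material objects are hypothetical placeholders, not any real library's API:

    import numpy as np

    # Minimal Whitted-style recursion sketch. Every hit can spawn up to three new
    # rays (shadow, reflection, refraction), so the work grows roughly as 3**depth.
    def trace(scene, ray, depth):
        if depth == 0:
            return np.zeros(3)                 # cut off the exponential blow-up
        hit = scene.intersect(ray)
        if hit is None:
            return scene.background(ray)

        color = np.zeros(3)
        # shadow ray: is the light visible from the hit point?
        if not scene.occluded(hit.point, scene.light_pos):
            color += hit.material.shade(hit, scene.light_pos)
        # reflection ray
        if hit.material.reflective:
            color += hit.material.kr * trace(scene, scene.reflect(ray, hit), depth - 1)
        # refraction ray, only for transparent surfaces
        if hit.material.transparent:
            color += hit.material.kt * trace(scene, scene.refract(ray, hit), depth - 1)
        return color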
Over the years, people have tried to solve the abysmal speed of the classic ray tracer with more clever methods:
Path tracing.
Instead of tracing 3 new rays at each bounce, trace one random ray.
Really, just shoot a ray in a random direction. It works because, if you shoot enough rays and your random distribution is uniform enough, the whole scene will be covered in random rays (paths) that can be averaged to fill in the "blank" spaces.
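Here is a hedged sketch of that single-random-ray loop, again with hypothetical scene and material objects (sample_hemisphere and the BRDF call are illustrative, not a real API):

    import numpy as np

    # One random direction per bounce; average many such paths per pixel.
    def trace_path(scene, ray, rng, max_bounces):
        result = np.zeros(3)
        throughput = np.ones(3)
        for _ in range(max_bounces):
            hit = scene.intersect(ray)
            if hit is None:
                break
            result += throughput * hit.material.emission       # did the path reach a light?
            direction = sample_hemisphere(hit.normal, rng)      # the one random ray
            pdf = 1.0 / (2.0 * np.pi)                           # uniform hemisphere pdf
            cosine = max(float(direction.dot(hit.normal)), 0.0)
            throughput *= hit.material.brdf(ray, direction, hit) * cosine / pdf
            ray = scene.make_ray(hit.point, direction)
        return result

    def sample_hemisphere(normal, rng):
        # rejection-sample a uniform direction on the hemisphere around `normal`
        while True:
            d = rng.uniform(-1.0, 1.0, size=3)
            if d.dot(d) <= 1.0 and d.dot(normal) > 0.0:
                return d / np.linalg.norm(d)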
Of course it doesn't work in all cases: what if we have a mirror? We cannot shoot randomly; a mirror reflects in exactly one direction. What if the medium is not entirely opaque (subsurface scattering, caustics)? We cannot trace things purely at random there either.
So, other hacks are incoming...
Photon mapping.
We trace paths from both the eye and the lights. First we trace from the lights, and where the light hits the scene a special structure, the "photon map", is built to "cache" the results.
Then rays from the eye are traced, and when an intersection is found, the photon map is sampled at that point.
The photon map itself is stored in a special structure called a k-d tree, which can quickly find the "photons" closest to a particular point in space (the eye-ray intersection we just found).
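The gather step looks roughly like this; in this sketch scipy's cKDTree stands in for the photon map, and the photon arrays are assumed to have been filled by the light-tracing pass:

    import numpy as np
    from scipy.spatial import cKDTree

    # photon_positions: (N, 3) points where light paths hit surfaces
    # photon_power:     (N, 3) RGB power carried by each stored photon
    def estimate_radiance(photon_positions, photon_power, hit_point, k=50):
        photon_map = cKDTree(photon_positions)            # the "photon map"
        dists, idxs = photon_map.query(hit_point, k=k)    # k photons nearest to the hit
        radius = dists.max()
        gathered = photon_power[idxs].sum(axis=0)
        # density estimate: the gathered power divided by the disc area it came from
        return gathered / (np.pi * radius * radius)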
Photon mapping can produce a lot of realistic effects if the map is detailed enough.
It can also use a hierarchy of maps to keep effects that need insanely detailed maps (like caustics) local and small.
Metropolis light transport.
This algorithm tries to reduce the "randomness" of the rays by using a specific sampling scheme: the Metropolis-Hastings algorithm.
It's all statistical black magic, but the idea is simple: use what we know about the materials at the ray intersection points to cast the "random" rays not entirely at random, but in the most probable directions.
It also reduces the probability of casting rays into the "void" - places where there are no objects in the scene.
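The Metropolis-Hastings core fits in a few lines. In this sketch the path representation, its brightness function and the mutation are hypothetical placeholders, and the mutation is assumed to be symmetric:

    # Mutations of a bright path are accepted more often, so samples concentrate
    # where light actually flows instead of being wasted on the "void".
    def metropolis_sample(initial_path, brightness, mutate, rng, n_samples):
        current = initial_path
        current_b = brightness(current)
        samples = []
        for _ in range(n_samples):
            proposal = mutate(current, rng)          # small random change to the path
            proposal_b = brightness(proposal)
            accept = min(1.0, proposal_b / max(current_b, 1e-12))
            if rng.uniform() < accept:
                current, current_b = proposal, proposal_b
            samples.append(current)                   # on rejection the old path is reused
        return samples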
There are a lot more hacks, and some of them are pretty new, like VCM (vertex connection and merging, published in 2012), which is essentially a big bag of small tricks for casting rays more efficiently and then merging the results.
Now for the main reason you opened this: what about DXR, RTX and the next-gen consoles?
Can we use all of the above? Is it all hardware accelerated? When will we get to photorealistic games?
The last question is easy to answer: you can get to photorealism any time: just pre-bake everything and boom!
That's what happened in the latest NV Marbles demo: a lot of pre-baked assets.
As for the other questions, they are harder to answer.
If we go by what's available in DXR, the future is pretty grim. It offers a very simplistic BVH (called an AS, for acceleration structure): a two-level tree, with the actual geometry living exclusively on the second level.
You cannot build a photon map with that, nor any other BVH that would be more efficient for a particular scene or game.
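To see how flat that hierarchy is, here is a rough sketch of its shape (illustrative classes only, not actual DXR API types):

    class BottomLevelAS:
        # second level: the actual geometry (vertices/triangles) lives only here
        def __init__(self, triangles):
            self.triangles = triangles

    class TopLevelAS:
        # first level: a flat list of instances, each a transform plus a bottom-level AS;
        # no deeper nesting, no custom per-scene hierarchy, nowhere to hang a photon map
        def __init__(self):
            self.instances = []

        def add_instance(self, transform, blas):
            self.instances.append((transform, blas))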
DXR should actually be called "programmable rasterization" or "rasterization shaders", because that's what it is. But because it has that fixed-path, hardware-accelerated search through the AS, they called it "ray tracing".
In reality, DXR uses compute-style shaders everywhere: ray generation, closest hit, miss, intersection, any hit. The only truly hardware-accelerated part is the ray search itself; everything before and after is done by regular compute-style shaders.
More than that, the result cannot even be written to a render target and needs to be copied from a compute shader output (they could not bypass the ROPs, it seems).
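A rough mental model of what happens per ray, written as plain code (this is not real DXR or HLSL; the function names just mirror the shader stages the API exposes):

    def dispatch_one_ray(acceleration_structure, ray_generation, closest_hit, miss):
        ray = ray_generation()                              # shader: decides what to shoot
        hit = acceleration_structure.find_closest_hit(ray)  # the only HW-accelerated step
        if hit is None:
            return miss(ray)                                # shader: sky/environment fallback
        return closest_hit(ray, hit)                        # shader: shading, may shoot more rays
    # the resulting color lands in a compute-style output buffer and is copied to the
    # render target afterwards, instead of going through the normal raster output path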
Is it really that bad?
No, it's not. It's only a losing game from a marketing point of view: RT is slow and will remain slow.
But if we really look at it as "raster shaders", it's awesome!
You can use all of the tricks I've described above, in moderation, in the specific places where they're needed.
You can even use them in screen space!
Photon mapping caustics in screen space? I'll take two!
Off-screen reflections in mirrors? Bring it on!
Path trace only the closest objects? Yeah!
Soft shadows? Take my money!