
Inside the AI-Powered Rendering Tech Polyphony Digital Is Building for GT's Future [GTP]

Skifi28

Member

Polyphony Digital is developing a new rendering system for Gran Turismo that uses neural networks to determine which objects in a scene need to be drawn, and the early results suggest it could meaningfully improve performance on PlayStation 5. The system, called "NeuralPVS", was detailed in a technical presentation at the Computer Entertainment Developers Conference (CEDEC) last year. The talk was given by two Polyphony graphics engineers: Yu Chengzhong and Hajime Uchimura.

How Rendering Works Now


Every frame Gran Turismo renders contains thousands of objects: buildings, trees, grandstands, barriers, track surfaces, and everything else that makes up a course environment. But, at any given moment, only a fraction of those objects are actually visible to the player. Some are behind the camera, some are off to the side, and some are hidden behind other objects in the scene.

Drawing all of those invisible objects would be a waste of processing power. So the game uses a process called "culling" to figure out which objects can be skipped. The better the culling, the less work the CPU and GPU have to do, which means more stable frame rates and potentially more room for visual detail.

Gran Turismo 7 currently uses a precomputed culling system. Before a course ships, Polyphony's tools render the track from thousands of camera positions along the driving surface, recording which objects are visible from each spot. Those results are stored as visibility lists (internally referred to as "vision lists") that the game looks up at runtime.


To keep the data manageable, the system clusters those thousands of sample points into a smaller set of zones using a mathematical technique called Voronoi partitioning. At runtime, the game figures out which zone the camera is in and uses that zone's visibility list to decide what to draw.
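As a rough illustration of that kind of lookup (a minimal sketch with made-up zone data, not Polyphony's actual implementation), the runtime step amounts to a nearest-center search plus a table read:

```python
import math

# Hypothetical precomputed data: zone centers (from clustering the sample
# points) and, for each zone, the set of object IDs visible from that zone.
zone_centers = [(0.0, 0.0), (50.0, 0.0), (100.0, 20.0)]
zone_visibility = [
    {"pit_wall", "grandstand_a", "tree_cluster_1"},
    {"grandstand_a", "bridge", "tree_cluster_2"},
    {"bridge", "tunnel_mouth", "tree_cluster_2"},
]

def visible_objects(camera_xy):
    """Find the nearest zone center and return that zone's visibility list."""
    best = min(range(len(zone_centers)),
               key=lambda i: math.dist(camera_xy, zone_centers[i]))
    return zone_visibility[best]

print(visible_objects((48.0, 3.0)))  # snaps to zone 1
```

Note how the answer flips wholesale the instant the camera crosses into a different zone, which is exactly the discontinuity problem described below.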

Where The Current System Falls Short


This clustering approach works, but it has some inherent limitations.


The boundaries between zones are hard lines, which means visibility can only change in abrupt, discontinuous jumps as the camera crosses from one zone to the next. Those boundaries don't always line up neatly with the actual geometry of the course, either, which can lead to objects popping in or out at moments that don't look natural.

Neural Networks to the Rescue!


NeuralPVS replaces that zone-based lookup with a neural network that learns the relationship between a camera position and which objects should be visible. Instead of snapping to the nearest precomputed zone, the network takes the camera's exact coordinates and outputs a visibility prediction for every object in the scene.


The result is a smooth, continuous visibility field rather than a patchwork of discrete zones. Objects transition in and out of visibility gradually as the camera moves, rather than flipping on and off at arbitrary zone boundaries.
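The presentation doesn't spell out the network architecture, but the general shape of the idea can be sketched as a tiny feed-forward network mapping camera coordinates to a per-object visibility mask. Everything here is illustrative: random weights stand in for a trained model, and the object count is made up.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_OBJECTS = 8  # tiny toy scene; a real course has thousands of objects

# Toy two-layer network; random weights stand in for trained ones.
W1 = rng.normal(size=(3, 16)); b1 = np.zeros(16)
W2 = rng.normal(size=(16, NUM_OBJECTS)); b2 = np.zeros(NUM_OBJECTS)

def predict_visibility(camera_pos, threshold=0.5):
    """Map an exact camera position to a per-object draw/skip mask."""
    h = np.maximum(np.asarray(camera_pos) @ W1 + b1, 0.0)  # ReLU hidden layer
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))               # sigmoid per object
    return p >= threshold  # boolean visibility prediction per object

mask = predict_visibility([120.0, 4.0, -35.0])
print(mask.sum(), "of", NUM_OBJECTS, "objects drawn")
```

Because the input is the continuous camera position rather than a zone index, nearby camera positions produce nearby predictions, which is where the smooth visibility field comes from.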



PS5 Benchmarks


The presentation included benchmark data from PlayStation 5 running on two courses: Eiger Nordwand and Grand Valley.



On Eiger Nordwand, average CPU frame time dropped from 3.944ms to 3.758ms with NeuralPVS enabled. GPU improvements were smaller (averaging 0.026ms), which makes sense given that Eiger is a course with relatively few occluders. The gains come from the network learning to use terrain features that the existing system overlooks.


Grand Valley showed more dramatic results. CPU average dropped from 4.552ms to 4.256ms, and the CPU maximum dropped from 6.378ms to 5.849ms, a reduction of over half a millisecond. GPU load was also more stable across the course, with the maximum GPU time dropping by nearly 0.1ms.
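For reference, the deltas those figures imply, and how they relate to a 60fps frame budget (simple arithmetic on the numbers quoted above):

```python
# CPU frame-time deltas implied by the quoted Grand Valley numbers (ms).
avg_saving = 4.552 - 4.256   # average-frame saving
max_saving = 6.378 - 5.849   # worst-case-frame saving
print(f"avg: {avg_saving:.3f} ms, max: {max_saving:.3f} ms")

# At 60 fps the whole frame budget is ~16.7 ms, so even a ~0.3 ms CPU
# saving is roughly 2% of the budget freed on an average frame.
budget_60fps = 1000 / 60
print(f"{avg_saving / budget_60fps:.1%} of a 60 fps frame budget")
```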


I think this is a pretty interesting solution to an age-old problem and the gains seem to be there. On a 20km track like the Ring, I imagine they would be even greater. As far as I'm concerned this is how AI should be used in games: to provide engineering solutions to things that can't be easily solved by hand, not to replace artists with slop. I wonder if it'll be deployed on GT7 or if we'll have to wait for the next one.
 
Every publisher really needs one or two powerhouses that pump new ideas and incorporate them throughout the organization. It would be interesting to hear if Sony pushes this agenda on first party studios or protects it like a special sauce. Microsoft has Playground and id.
 
I think this is a pretty interesting solution to an age-old problem and the gains seem to be there.
Tbh not enough info to say if it actually is worthwhile. Back in the day we used cell partitioning as an evolution of larger zones: each cell was just a few meters across and held a single simple (128-bit) visibility index, so it was quite compact and super fast to evaluate at runtime, and the cell size kept continuity just fine.

The problem was always the precomputation cost, as that severely impacts iteration on the tracks (every change = recompute visibility for the track).
The described method has to retrain the model each time, so the question is how much more or less expensive that is, but it still retains the same weakness as the original approach.
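The cell scheme described above can be sketched like this (hypothetical cell size and grid width; the point is that the runtime cost is one index computation and one bit test):

```python
# Sketch of cell-based precomputed visibility: a uniform grid of small
# cells, each storing one 128-bit mask where bit i means "object i is
# visible from somewhere in this cell".
CELL_SIZE = 4.0  # meters per cell side (made-up value)
GRID_W = 256     # cells along x (made-up value)

visibility = {}  # cell index -> 128-bit int mask (sparse dict for the demo)

def cell_index(x, z):
    return int(z // CELL_SIZE) * GRID_W + int(x // CELL_SIZE)

def bake(x, z, object_id):
    """Offline step: mark object_id visible from the cell containing (x, z)."""
    idx = cell_index(x, z)
    visibility[idx] = visibility.get(idx, 0) | (1 << object_id)

def is_visible(x, z, object_id):
    """Runtime step: one index computation and one bit test."""
    return bool(visibility.get(cell_index(x, z), 0) & (1 << object_id))

bake(10.0, 20.0, 7)
print(is_visible(11.5, 22.0, 7))  # same 4 m cell -> True
print(is_visible(11.5, 22.0, 6))  # object 6 was never baked -> False
```

The runtime lookup is trivially cheap; as the post says, the expensive part is the offline bake, which has to be redone whenever the track changes.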
 
Tbh not enough info to say if it actually is worthwhile. Back in the day we used cell partitioning as an evolution of larger zones: each cell was just a few meters across and held a single simple (128-bit) visibility index, so it was quite compact and super fast to evaluate at runtime, and the cell size kept continuity just fine.

The problem was always the precomputation cost, as that severely impacts iteration on the tracks (every change = recompute visibility for the track).
The described method has to retrain the model each time, so the question is how much more or less expensive that is, but it still retains the same weakness as the original approach.
I don't believe they touched on that; the article and presentation it's based on seem to be focused on the better results + lower real-time computational cost. I don't see why it wouldn't be worth it unless the iteration cost was much higher with this solution, which I doubt, as they wouldn't be presenting it like this. Perhaps retrofitting it to GT7 might be too much trouble for a 4-year-old game, but it could well be worth it for GT8 if the content is built with the new tech in mind from the start.
 
The described method has to retrain the model each time, so the question is how much more or less expensive that is, but it still retains the same weakness as the original approach.

It's a small model, probably trained per circuit. The highest cost is probably the R&D of the model and the training itself. If the model is precise, it has to be faster than doing it manually.
 
It's a small model, probably trained per circuit.
That's not the point/question - the main weakness of precomputed visibility is that 'every change = wait times' so you negatively impact your iterations.
The process was already entirely automated, but when it's measured in days or hours, or double digit minutes - it's already a problem. And that's what it typically was in the 00s - we would run nightly builds to precompute visibility across tracks (and so did everyone else back then).
Most model training times are decidedly NOT measured in minutes or seconds.

If the model is precise, it has to be faster than doing it manually
It hasn't been manual for 20+ years. Even in the context of GT specifically (not games as a whole) - I think only GT3 involved manual steps.

I don't believe they touched on that; the article and presentation it's based on seem to be focused on the better results + lower real-time computational cost.
Yea, which is my point - the realtime costs have been trivial since the early-mid 00s with this approach, so there's really nothing left to optimize for static geometry. The main open-ended topics in visibility have been dynamic objects (which this obviously isn't about) and minimizing/eliminating precompute altogether.

I don't see why it wouldn't be worth it unless the iteration cost was much higher with this solution
Even if it's the same I question the value - the precompute step in and of itself stands in the way of iterations. It's why the industry has been so obsessed with realtime GI and other similar things - let's be honest, temporal instabilities in most modern lighting pipelines are way, WAY worse than where we were 15 years ago, but the tradeoff is massive gains in scale and iteration speed. Even the 'pathtracing' holy grails still suffer from what I can best describe as 'gelatinous' behaviour in realtime; the days of true-realtime lighting changes have been abandoned entirely this generation.

As to why this specific technique - over 3 decades I've seen, way more often than not, engineering solutions that were all about 'rule of cool' rather than any particular outcome improvements. Software engineers love to come up with interesting solutions above all - it's not always as pragmatic as one would like to believe. And an AI model mapping visibility is - well, just really cool.
 
As far as I'm concerned this is how AI should be used in games: to provide engineering solutions to things that can't be easily solved by hand, not to replace artists with slop. I wonder if it'll be deployed on GT7 or if we'll have to wait for the next one.
Well said. Companies hurrying to do everything with AI are in for a disappointment; nobody wants it.
 
Every publisher really needs one or two powerhouses that pump new ideas and incorporate them throughout the organization. It would be interesting to hear if Sony pushes this agenda on first party studios or protects it like a special sauce. Microsoft has Playground and id.
That happens a lot more than you think. A lot of technology/best practices/etc are shared across first party studios.
 
The frametime gains seem very very minimal but if it reduces pop-in then I suppose that is good. Seems strange though, couldn't they just cull like most games do based on frustum/occlusion? Maybe I'm not understanding the aim of this very well.
 
PD remains one of the best devs around.
They are. Also, the ones using a laughing emoji need to check themselves. PD before PS3 were true rockstars; booting up GT3 for the first time melted my 14 y/o brain. I also played Sega GT..... 😂😂 I loved Sega, but I literally talked myself into thinking it was a better game than any GT before it.
 
So what we are watching during gameplay is an illusion? A bubble with no life outside it?!

I don't know if this is a joke on your side, but basically all games function like this; otherwise you could not get them to fit in memory or run anywhere near a playable framerate.
 
The frametime gains seem very very minimal but if it reduces pop-in then I suppose that is good. Seems strange though, couldn't they just cull like most games do based on frustum? Maybe I'm not understanding the aim of this very well.
Frustum still uses zones, right?

Sounds like this skips grouping, and just knows what objects a camera should see. Instead of going from zone to zone as it switches cameras, it adds/removes objects based on what camera it's using.

Right now they render everything in a zone, this would let them render what a camera actually sees, and not just what's in the zone list for that camera.

If I'm understanding it correctly.
 
Yes, yes. This shit looks pretty. Pretty visuals aren't impressive anymore. Haven't been since Crysis in 2007.

GT 4 was (and still is) the most fun I'd had in a GT game. The Café stuff in GT 7 was finally a neat idea after the uninspired slop of the PS3 games and Sport. But they've fumbled the execution. Making the progression linear and making me hop between cars constantly instead of letting me "build a relationship" with a car was the wrong call, imho. And what's with the inflated prices of the cars? That's quite a poor attempt at stretching the game, making shit grindy as hell; it's like they've taken the wrong lessons from GT5 and just doubled down on them. What about better AI opponents? They are a fucking joke, so much so that I actually think that Polyphony have just given up. I'd rather they improve on those fronts than making shit even prettier.
 
Yes, yes. This shit looks pretty. Pretty visuals aren't impressive anymore. Haven't been since Crysis in 2007.

GT 4 was (and still is) the most fun I'd had in a GT game. The Café stuff in GT 7 was finally a neat idea after the uninspired slop of the PS3 games and Sport. But they've fumbled the execution. Making the progression linear and making me hop between cars constantly instead of letting me "build a relationship" with a car was the wrong call, imho. And what's with the inflated prices of the cars? That's quite a poor attempt at stretching the game, making shit grindy as hell; it's like they've taken the wrong lessons from GT5 and just doubled down on them. What about better AI opponents? They are a fucking joke, so much so that I actually think that Polyphony have just given up. I'd rather they improve on those fronts than making shit even prettier.
This doesn't make things prettier. Just using AI to make rendering more efficient. If this was implemented in GT7 tomorrow, for example, the fidelity of the things it renders would be exactly the same as today.
 
This doesn't make things prettier. Just using AI to make rendering more efficient. If this was implemented in GT7 tomorrow, for example, the fidelity of the things it renders would be exactly the same as today.
But performance would be more stable, allowing for less LOD pop-in.
 
Given their past development timelines, no way this launches on PS5… unless that 2029 rumor is true and Kepler is just blowing smoke because the impending AI war will erase all traces of their digital lies!
 
That does seem like a really clever solution to, like you said, an age-old problem. I don't like AI, but yeah, I don't have too much of a problem with it being used in this way. It's not that different from the AI denoising GPUs already do to make real-time raytracing look good.

Also, the article makes it sound like the technology was created by Polyphony and is exclusive to them, which I approve of, because that means it's not taking away work from programmers.
 
Would this technique also work for FPS games?

Knowing what objects are occluded by other objects is a problem that affects all 3D games.
The industry has developed a lot of algorithms to try to cull objects as early as possible, so as not to waste work rendering unnecessary things.
This seems like the next step to solving the problem.
 
Seriously, several of you are non-stop saying Microslop, but when a Sony dev announces the use of AI it's "clever"?

Lmao ok

Because Sony is trying to improve games' rendering, while Microsoft is just doing the opposite: it's just enshittification of its services, products, and games.
 
Seriously, several of you are non-stop saying Microslop, but when a Sony dev announces the use of AI it's "clever"?

Lmao ok
I think there's a pretty big difference between using AI to help with culling in rendering on a game and using AI to code your operating system and all the software for it, and then trying to force your users to use AI.
 
I think there's a pretty big difference between using AI to help with culling in rendering on a game and using AI to code your operating system and all the software for it.

Exactly. All we have to do is look at how bad Windows, Office and Teams have become in the last few years.
Microsoft products have always been lacking in quality, but now it's just a shitshow.
 
Seriously, several of you are non-stop saying Microslop, but when a Sony dev announces the use of AI it's "clever"?
There are two general AI strategies going on.
1) Human replacement (augmentation), which is what most of the AI model companies are pushing for - including Microsoft.
2) Product augmentation - which is what this thread is discussing for GT.

You can argue that there's 'slop' in both tracks - but only one of them has the sole objective to replace you (and the rest of us) in the productivity/economic loop.

Frustum still uses zones, right?
No - Frustum is the area of visibility in front of camera. But that's been used by every 3d game, ever since the first 3d vector titles were made in 1979. It doesn't matter what occlusion mechanism you use - BSP, zones/cells, occlusion queries etc. - you're 'clipping' those results against frustum(s) regardless.

Right now they render everything in a zone, this would let them render what a camera actually sees, and not just what's in the zone list for that camera.
If I'm understanding it correctly.
It lets them render what the AI thinks the camera actually sees - it's a minor distinction, but nonetheless, this will always disagree with ground truth to some extent - by definition. A machine model is still an approximation - but it's a tight-fitting one, so the errors are smaller than approximating with zones.
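A minimal version of the frustum clipping step mentioned above, with the frustum as a set of inward-facing planes and objects as bounding spheres (a generic textbook test, not any specific engine's code):

```python
# Whatever survives occlusion culling still gets clipped against the
# camera frustum. Here the frustum is a list of inward-facing planes
# (normal, distance) and objects are bounding spheres.
def sphere_in_frustum(center, radius, planes):
    """Reject the sphere only if it lies fully behind some frustum plane."""
    for normal, d in planes:
        dist = sum(n * c for n, c in zip(normal, center)) + d
        if dist < -radius:
            return False  # entirely outside this plane
    return True  # inside or intersecting every plane

# A toy "frustum": just a near plane facing +z and a far plane facing -z.
planes = [((0.0, 0.0, 1.0), -1.0),    # near plane at z = 1
          ((0.0, 0.0, -1.0), 500.0)]  # far plane at z = 500

print(sphere_in_frustum((0, 0, 100), 5.0, planes))  # True: between planes
print(sphere_in_frustum((0, 0, -20), 5.0, planes))  # False: behind near plane
```

A real frustum has six planes (near, far, left, right, top, bottom), but the per-plane test is the same.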
 
Yes, yes. This shit looks pretty. Pretty visuals aren't impressive anymore. Haven't been since Crysis in 2007.

GT 4 was (and still is) the most fun I'd had in a GT game. The Café stuff in GT 7 was finally a neat idea after the uninspired slop of the PS3 games and Sport. But they've fumbled the execution. Making the progression linear and making me hop between cars constantly instead of letting me "build a relationship" with a car was the wrong call, imho. And what's with the inflated prices of the cars? That's quite a poor attempt at stretching the game, making shit grindy as hell; it's like they've taken the wrong lessons from GT5 and just doubled down on them. What about better AI opponents? They are a fucking joke, so much so that I actually think that Polyphony have just given up. I'd rather they improve on those fronts than making shit even prettier.
Are you living under a rock? They have Sophy AI in a lot of tracks now and Sophy 2.0 in the new campaign.
 
No - Frustum is the area of visibility in front of camera. But that's been used by every 3d game, ever since the first 3d vector titles were made in 1979. It doesn't matter what occlusion mechanism you use - BSP, zones/cells, occlusion queries etc. - you're 'clipping' those results against frustum(s) regardless.


It lets them render what the AI thinks the camera actually sees - it's a minor distinction, but nonetheless, this will always disagree with ground truth to some extent - by definition. A machine model is still an approximation - but it's a tight-fitting one, so the errors are smaller than approximating with zones.
The way I understood it is, even with a 3d camera, you still need to store your geometry in zones, in order to be able to efficiently check for things like LOD, and also to help make culling more efficient. The problem with zones has always been that they're not very precise. As the camera moves, you get abrupt shifts in zones, zones being partially at one level of distance from the camera, and partially at another. Geometry that overlaps multiple zones... Stuff like that.
 
Seriously, several of you are non-stop saying Microslop, but when a Sony dev announces the use of AI it's "clever"?

Lmao ok
There are several very good use cases for AI in game dev, like in everything else. This is one of them. Microsoft also uses a lot of them.

The problem for Microsoft is that they're the #1 or #2 AI merchant, so a lot of what they put out is indeed slop. But Microsoft has a lot of really cool AI applications.

Not everything is console war.
 
PD once again demonstrates the actually useful way to use AI in game engines outside of upscaling.

First Sophy (even if it's somewhat limited on current gen). Now AI culling, which is actually a pretty neat thing for racing games (and sims in general) with FAST asset streaming. Less pop-in in VR/2D is always a good thing, and this concept is more about efficiency and fewer man-hours, which in theory could significantly speed up the track creation pipeline.
 
The way I understood it is, even with a 3d camera, you still need to store your geometry in zones, in order to be able to efficiently check for things like LOD, and also to help make culling more efficient.
Again - this only applies to occlusion culling. All manner of frustum checks (including LOD) are done against a standard spatial structure like an octree; there are no zones involved.

The problem with zones has always been that they're not very precise. As the camera moves, you get abrupt shifts in zones, zones being partially at one level of distance from the camera, and partially at another. Geometry that overlaps multiple zones... Stuff like that.
LOD isn't really impacted by any of this - occlusion is, but I answered this in my first post. Dividing the (playable) world into evenly spaced visibility cells sized in single-digit meters was viable as far back as the PS2 era, with a mere 32MB of RAM, and it essentially eliminated the problem of discontinuity between regions.
To be extra blunt - PS2 titles I worked on literally made this transition from regions to cells, and the associated lack of visible pop-in in the next titles is quite clear.

And if it's not obvious, the method translated perfectly into the HD era and beyond - I've been involved in titles shipping with static occlusion of this nature well into 2013. What did happen in the past decade is that more focus was put on dynamic occlusion approaches - and I've also seen titles where we did a crude manual version of region occlusion as late as 2017, mainly because the 'cutting edge' engine in question had no suitable facilities for occlusion on that scale, but I digress.
Point being that the specific thing the article describes has long been a solved problem - precomputing and dynamic occlusion aren't - but what is described in the article doesn't solve those either.

It is, however, a very cool way to solve the same problem again. But I also mentioned rule of cool before - we're still on the same page; hopefully I don't need to keep referring to the same points.
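To illustrate the point above that LOD selection is independent of visibility zones, here is a minimal sketch of distance-driven LOD picking (made-up thresholds; no zones or cells involved, just camera distance):

```python
import math

# Hypothetical LOD thresholds: the mesh detail level is picked purely
# from camera distance, independent of any visibility zone or cell.
LOD_DISTANCES = [50.0, 150.0, 400.0]  # meters: boundaries for lod0..lod3

def select_lod(camera_pos, object_pos):
    """Return the LOD index for an object given the camera position."""
    d = math.dist(camera_pos, object_pos)
    for lod, limit in enumerate(LOD_DISTANCES):
        if d < limit:
            return lod
    return len(LOD_DISTANCES)  # beyond all thresholds: coarsest LOD

print(select_lod((0, 0, 0), (30, 0, 0)))    # 0: full detail
print(select_lod((0, 0, 0), (300, 0, 0)))   # 2: medium detail
print(select_lod((0, 0, 0), (1000, 0, 0)))  # 3: coarsest
```

In practice the candidate objects come out of a spatial structure like an octree, but the LOD decision itself is just this per-object distance check.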
 
Seriously, several of you are non-stop saying Microslop, but when a Sony dev announces the use of AI it's "clever"?

Lmao ok
How has any MS studio used AI these last years? Any practical examples in their games? Because Sony has already used AI in a few shipped games.
 
Again - this only applies to occlusion culling. All manner of frustum checks (including LOD) are done against a standard spatial structure like an octree; there are no zones involved.


LOD isn't really impacted by any of this - occlusion is, but I answered this in my first post. Dividing the (playable) world into evenly spaced visibility cells sized in single-digit meters was viable as far back as the PS2 era, with a mere 32MB of RAM, and it essentially eliminated the problem of discontinuity between regions.
To be extra blunt - PS2 titles I worked on literally made this transition from regions to cells, and the associated lack of visible pop-in in the next titles is quite clear.

And if it's not obvious, the method translated perfectly into the HD era and beyond - I've been involved in titles shipping with static occlusion of this nature well into 2013. What did happen in the past decade is that more focus was put on dynamic occlusion approaches - and I've also seen titles where we did a crude manual version of region occlusion as late as 2017, mainly because the 'cutting edge' engine in question had no suitable facilities for occlusion on that scale, but I digress.
Point being that the specific thing the article describes has long been a solved problem - precomputing and dynamic occlusion aren't - but what is described in the article doesn't solve those either.

It is, however, a very cool way to solve the same problem again. But I also mentioned rule of cool before - we're still on the same page; hopefully I don't need to keep referring to the same points.
It sounds like you're much more of an expert on this than I am, so I believe you, but if you have time, I'm really curious and I want to make sure I understand. Are you talking about the frustum and dividing it up into clusters? I've read about that before, but I apparently never quite understood it properly. This can be used for determining LOD levels? I never knew that. But if I understand you correctly you're saying there are problems with this approach when dynamic geometry is involved? I assume because the data needs to be very carefully organized in data structures?
 