The fish are the cases where pages in system memory are updated by the CPU, GPU or SSD, and the caches being used by the GPU either have to be cleared completely or can have just the modified bits snipped out instead.
It was late and I wanted to make it intentionally wacky to drive home my point that it's just an analogy, there to convey the kind of concept Matt was talking about to someone who perhaps wanted an appreciation of it without being able to follow the technical side of things. Analogies can always be taken too literally and stretched too far, so by making it this silly it's more difficult for that to happen.
Something a bit closer to reality would be two different mathematics factories exploiting child labour, where one has 36 desks of kids that can do 223 calculations per minute per kid, and another has 52 desks of kids that can do 182 calculations per minute per kid.
If everything about how the factories were laid out and run was equal, then you'd be able to compare their peak number-crunching output by multiplying the number of desks by the calculations per minute per kid.
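To put numbers on that, here's a minimal Python sketch of the "desks times speed" comparison. It only uses the figures from the analogy; the function and variable names are my own, and it assumes the two factories really are run identically.

```python
# Peak output if both factories were run identically:
# number of desks multiplied by sums per minute per kid.
def peak_output(desks: int, sums_per_minute_per_kid: int) -> int:
    return desks * sums_per_minute_per_kid

small_factory = peak_output(36, 223)   # 8028 sums per minute
large_factory = peak_output(52, 182)   # 9464 sums per minute

# Relative advantage of the larger factory on paper.
paper_advantage = large_factory / small_factory - 1
print(f"{small_factory} vs {large_factory}: {paper_advantage:.1%} ahead on paper")
```

On paper the bigger factory comes out roughly 18% ahead, and that holds only as long as nothing else about the two factories differs.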
Each kid (CU) sitting at a desk (cache) has to get up and go and get a new list of sums from the next building (system memory) whenever he has finished what he has been allocated.
Sometimes a manager will come in and confiscate the list of sums from a kid that was working on them, and tell them it's out of date, and that they should go and get an updated version of that list from the next building and then carry on from where they left off.
In the factory with 36 kids, they walk to the next building to get a new list of sums at 4.48mph all of the time.
In the factory with 52 kids, they walk to the next building to get a new list of sums at an average speed somewhere between 5.6mph and 3.36mph.
Do the simple math on these two identically run factories and the one with more kids brute-forces its way ahead, despite having slower kids. More number crunching, good.
But what if the factory isn't run the same?
What if, in the factory with 36 kids, the manager doesn't come in and confiscate a list of sums currently being worked on, but just leans over the desk and crosses out the one that is out of date on that list so the kid can keep working?
Then you can't just multiply kids by kid calculation speed to compare the factories.
It also means you can't really compare the speed at which the kids walk, if in one factory the kids have to be walking a lot less. If you're slower at walking, but don't need to walk as often, you spend less time walking overall, and so don't need to walk as fast to do the same amount of work.
It also means that for a given number of kids sat at desks, you get more sums crunched under this type of management. It's more efficient, to the point that a 36-desk factory will perform closer to a factory run the old way with more desks.
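As a rough illustration of why fewer walks matter, here's a toy Python model. Everything beyond the desk counts, the sums per minute and the 4.48mph walking speed is invented for the sake of the example: the distance to the next building, the number of trips per hour, and the assumption that the bigger factory's kids average 4.48mph (the midpoint of their 3.36mph to 5.6mph range). It shows the shape of the argument only and is not a prediction of real GPU behaviour.

```python
def effective_output(desks, sums_per_minute_per_kid, walk_mph,
                     trips_per_hour, round_trip_miles):
    """Sums per minute for the whole factory after subtracting time spent walking."""
    hours_walking_per_hour = trips_per_hour * round_trip_miles / walk_mph
    working_fraction = max(0.0, 1.0 - hours_walking_per_hour)
    return desks * sums_per_minute_per_kid * working_fraction

ROUND_TRIP_MILES = 0.1   # made-up distance to the next building and back
OLD_WAY_TRIPS = 10       # made-up trips per hour when whole lists get confiscated
NEW_WAY_TRIPS = 5        # made-up trips per hour when only stale sums get crossed out

# 52-desk factory run the old way (4.48mph is an assumed average speed).
large_old = effective_output(52, 182, 4.48, OLD_WAY_TRIPS, ROUND_TRIP_MILES)
# 36-desk factory run the old way, then with the smarter manager.
small_old = effective_output(36, 223, 4.48, OLD_WAY_TRIPS, ROUND_TRIP_MILES)
small_new = effective_output(36, 223, 4.48, NEW_WAY_TRIPS, ROUND_TRIP_MILES)

gap_old = large_old / small_old - 1
gap_new = large_old / small_new - 1
print(f"gap with identical management: {gap_old:.1%}")
print(f"gap with smarter management:   {gap_new:.1%}")
```

With identical management the gap is just the paper figure again. Halving the trips for the 36-desk factory shrinks it, and how much it shrinks depends entirely on the invented trip numbers, which is exactly the part nobody outside the studios knows.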
A bit less fishy, now!
What I don't know is how often GPU caches get invalidated in typical game code, or how much difference this actually makes in practice. A 36 CU GPU with Coherency Engine & Cache Scrubbers performs like a ?? CU GPU working the way things have always worked.
It could be a tiny efficiency gain that's nice to have, or it could be really significant.
Matt seemed to suggest that it's more significant than people realise, especially as the time available to render a frame gets smaller at higher FPS.
It could be that XSX scales better at higher native resolutions and PS5 scales better at higher frame-rates, which is more critical to something like VR.
It could be that in typical game code the difference in performance between the two is significantly less than the 15-18% difference you get just by counting CUs and clock-speeds and ignoring the faster and more efficient caches. All of this is conjecture and something I don't know and am not qualified to predict. I'm excited to see how it pans out.
In my opinion this doesn't mean the PS5 GPU has now closed the gap, but from what Matt was saying, not only might the gap be smaller than some people are throwing around as fact, but in some graphics rendering scenarios the PS5 GPU might actually be better.
I don't think either side is getting particularly short-changed on graphics this time around, but that's just my hunch, and we'll all know soon enough.