But surely running at 10.28 TF at all times if needed would be better? I don't get how varying clocks in a predictable manner could be better than the full 10.28 TF at all times.
Full CU utilisation in a synthetic workload (even for the tiny period it would get close) produces X heat and requires Y power at a fixed clock, and in this situation both X and Y are much larger than the heat a console can dissipate and the power it can supply.
The current gen consoles are designed around the real-world maximum utilisation expected by the end of a generation, which is less than half of full going by Cerny's "doing well if hitting 40%" quote. Adequate power and cooling are provided to meet that prediction, and the console firmware will shut the device down if the thermal range or power usage is exceeded. On working retail hardware, code that did that would get flagged as a bug by QA in development, so games that trigger these conditions don't release. In effect they are constraints programmers have to work within.
This is how the XsX is still designed. The PS5 doesn't need to worry about that because, according to Cerny, code like this that wasn't caught in testing or certification will be handled more gracefully: the system deterministically balances power against utilisation by varying the clock frequency.
Take the scenario where an algorithm that downclocks the PS5 GPU uses lots of matrix multiplications (which are a critical-path calculation), and then the algorithm gets optimised post launch in a patch.
The developer realises they can rearrange the algorithm and reduce the matrix multiplications by pre-calculating the parts that are constant into fewer multiplications, thereby reducing CU utilisation for the task.
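To make that kind of optimisation concrete, here is a minimal Python sketch. Everything in it is made up for illustration (the 4x4 matrices `A` and `B`, the 100 vectors, and the helper names `matmul`, `naive` and `precomputed`); it is not from any real game code. It just counts scalar multiplies before and after hoisting the constant product `A @ B` out of the loop.

```python
def matmul(a, b, counter):
    """Multiply two matrices (lists of rows) and count scalar multiplies."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                out[i][j] += a[i][k] * b[k][j]
                counter[0] += 1
    return out


def naive(A, B, vectors):
    """Recompute the constant product A @ B for every vector."""
    counter = [0]
    results = []
    for v in vectors:
        ab = matmul(A, B, counter)          # same answer every iteration
        results.append(matmul(ab, v, counter))
    return results, counter[0]


def precomputed(A, B, vectors):
    """Compute A @ B once, then reuse it for every vector."""
    counter = [0]
    ab = matmul(A, B, counter)              # hoisted out of the loop
    results = [matmul(ab, v, counter) for v in vectors]
    return results, counter[0]


# Hypothetical data: two constant 4x4 matrices and 100 column vectors.
A = [[1, 2, 0, 0], [0, 1, 0, 0], [0, 0, 1, 3], [0, 0, 0, 1]]
B = [[2, 0, 0, 1], [0, 2, 0, 0], [0, 0, 2, 0], [0, 0, 0, 1]]
vectors = [[[i], [i + 1], [i + 2], [1]] for i in range(100)]

naive_results, naive_count = naive(A, B, vectors)
fast_results, fast_count = precomputed(A, B, vectors)
print("naive multiplies:      ", naive_count)
print("precomputed multiplies:", fast_count)
print("same results:          ", naive_results == fast_results)
```

Same results, far fewer multiplies, so the same task keeps the CUs less busy. That freed-up utilisation is what the rest of the argument is about.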
The result on PS5 is that either the algorithm stays at the same performance and another item fills the unused compute, or the algorithm gets clock boosted and completes the critical-path calculation earlier, allowing the next task to start earlier. Either way, a performance gain.
On the old fixed-clock design the algorithm just uses less power and performance remains static.
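The difference between the two designs can be sketched with a toy model. To be clear about the assumptions: the rule `power ~ utilisation * clock**3` is a rough rule of thumb (dynamic power goes roughly with voltage squared times frequency, and voltage rises with clock), and the budget, the utilisation figures and the function names are all hypothetical. This is not Sony's actual power-management algorithm, just an illustration of the argument above.

```python
# Toy model only. All numbers are normalised and hypothetical.
POWER_BUDGET = 0.8   # assumed budget, as a fraction of "100% busy at max clock"
F_MAX = 1.0          # normalised clock cap


def modelled_power(utilisation, clock):
    """Assumed rule of thumb: power scales with utilisation * clock^3."""
    return utilisation * clock ** 3


def variable_clock(utilisation, power_budget=POWER_BUDGET, f_max=F_MAX):
    """Highest clock that keeps the modelled power within the budget."""
    if utilisation <= 0:
        return f_max
    return min(f_max, (power_budget / utilisation) ** (1.0 / 3.0))


# Hypothetical fixed-clock design: clock pinned at the value that fits the
# budget for the pre-patch workload, and never changed afterwards.
FIXED_CLOCK = variable_clock(1.0)

for label, util in [("before patch", 1.0), ("after patch", 0.7)]:
    vc = variable_clock(util)
    print(f"{label}: utilisation={util:.2f}")
    print(f"  variable clock: clock={vc:.3f} power={modelled_power(util, vc):.3f}")
    print(f"  fixed clock:    clock={FIXED_CLOCK:.3f} "
          f"power={modelled_power(util, FIXED_CLOCK):.3f}")
```

In this model the patched workload lets the variable-clock design run back up to its cap while staying inside the budget, whereas the fixed-clock design keeps the same clock and simply draws less power.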