Whose fault is it?
If you have followed us this far, you can begin to see the causes of the slowdowns we observe in some benchmarks. To summarize what we have seen so far:
- Ryzen's memory latency is higher
- Bandwidth between the CCXs is particularly limited
- Cache accesses from one CCX to the other are expensive
Add to this the tendency of Windows to constantly move threads around, and certain behaviors become easier to understand. Indeed, moving threads from one CCX to another is extremely expensive on Ryzen, where bandwidth between the CCXs is limited.
One can easily imagine two cases that will cause slowdowns:
- Threads that, when migrated, pay a high latency price to access the data held in their caches
- Threads that, whether moved around or not, saturate the entire cache
In the case of 7-Zip or WinRAR, we lean toward the second option: in practice, these programs use a particularly large compression dictionary to which they constantly refer. We assume that in this case the limited bandwidth between the CCXs is a limiting factor.
For games, we assume that we are closer to the first case, or a mix of the two.
Is the problem unsolvable? Probably not. Several solutions are possible, such as adapting the Windows scheduler to limit thread movement out of a CCX, somewhat like what was done for Bulldozer and its modules. One can imagine that AMD is working with Microsoft to implement a system of this type, even though the manufacturer could not confirm it.
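To make the idea of keeping threads inside one CCX more concrete, here is a minimal sketch of the application-side equivalent: restricting a process to the logical CPUs of a single CCX through an affinity mask. This is not the scheduler change discussed above, only an illustration of the principle. The topology is an assumption (two CCXs of four cores, two SMT threads per core, logical CPUs 0-7 on the first CCX and 8-15 on the second), and the helper names `ccx_cpus`, `affinity_mask`, and `pin_current_process_to_ccx` are hypothetical. The actual numbering must be checked on the target system.

```python
import os

# Assumed topology (hypothetical, verify on the real machine):
# 2 CCXs, 4 cores per CCX, 2 SMT threads per core,
# logical CPUs 0-7 on CCX 0 and 8-15 on CCX 1.
CORES_PER_CCX = 4
THREADS_PER_CORE = 2


def ccx_cpus(ccx_index, cores_per_ccx=CORES_PER_CCX,
             threads_per_core=THREADS_PER_CORE):
    """Return the set of logical CPU ids assumed to belong to one CCX."""
    per_ccx = cores_per_ccx * threads_per_core
    start = ccx_index * per_ccx
    return set(range(start, start + per_ccx))


def affinity_mask(cpus):
    """Convert a set of logical CPU ids into a bitmask (bit n = CPU n)."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


def pin_current_process_to_ccx(ccx_index):
    """Restrict the current process to one CCX, where the OS allows it.

    Returns the set of CPUs actually applied, or None if nothing was done.
    """
    wanted = ccx_cpus(ccx_index)
    if not hasattr(os, "sched_setaffinity"):
        # Python's os module only exposes this call on some platforms
        # (Linux in particular); elsewhere we do nothing.
        return None
    # Keep only CPUs the process is currently allowed to run on.
    allowed = os.sched_getaffinity(0) & wanted
    if not allowed:
        return None
    os.sched_setaffinity(0, allowed)
    return allowed


if __name__ == "__main__":
    print(hex(affinity_mask(ccx_cpus(0))))
    print(hex(affinity_mask(ccx_cpus(1))))
```

On Windows, Python's `os` module does not provide `sched_setaffinity`; the same bitmask could instead be handed to the Win32 `SetProcessAffinityMask` function, or set by hand in the Task Manager. Pinning by hand is a blunt tool: it prevents cross-CCX migration, but it also halves the number of cores available to the process.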
Another change that could prove beneficial for games is the arrival of the "Game Mode" of Windows 10, one of whose features is precisely to move threads less. Other mitigation techniques are possible, and AMD should present several of them during a session at GDC; we will come back to this.
How much of the gap will be closed by these patches, fixes, and modifications? It is impossible to say, and we must remain very cautious about what the various developers will or will not do. The reduced bandwidth between the CCXs and the high latency will not change, at least not before a future revision of Zen.