This is a speculation thread after all (and thanks to Giordiemp for the good discussion over DM regarding this).
Speculation: I think Sony has a stacked chip in the PS5 where they have stacked memory on top of the APU die (or below).
Rationale: Sony has a history of stacking memory on top of or below the logic. They started with it in the PS Vita, whose main architect was Mark Cerny. They have continued with it in designs such as sensors/cameras. Memory takes up a lot of die area (mm²), and stacking it keeps it very close to the transistors that need it while making the logic die cheaper to fab.
It all started with a rumor years back from leakers that Sony had a collaboration with Micron to create memory to stack for the PS5.
Then we had 'The Road to PS5' where Mark Cerny stated that the I/O circuit allowed for direct file load into the GPU cache (most likely L2). This only makes real sense if the cache is large enough to both host ongoing work and have enough space to load a new complete file into it (such as a texture, which was Mark's example). An uncompressed 4K texture requires around 50MB, and an 8K texture (as was exemplified in the UE5 demo) roughly four times that, since it has four times as many texels.
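For anyone who wants to check the texture numbers, here is a rough back-of-the-envelope sketch. It is only my assumption that the textures are square (4096 and 8192 texels per side) and stored uncompressed at 3 bytes per texel (24-bit RGB); the function name and parameters are made up for illustration, and real games will use other formats.

```python
def texture_bytes(side, bytes_per_texel=3.0, with_mips=False):
    """Approximate size in bytes of a square texture.

    Assumptions (mine, not Sony's): square texture, fixed bytes per texel,
    and a full mip chain adding about one third on top of the base level.
    """
    base = side * side * bytes_per_texel
    return base * 4 / 3 if with_mips else base


MB = 1_000_000  # decimal megabytes, to match the rough figures in the post

sizes_mb = {
    name: texture_bytes(side) / MB
    for name, side in (("4K", 4096), ("8K", 8192))
}

for name, mb in sizes_mb.items():
    print(f"{name}: {mb:.0f} MB")
```

Under those assumptions the 4K case lands at about 50 MB and the 8K case at about 200 MB. A block-compressed format at 1 byte per texel would cut both to a third (pass bytes_per_texel=1.0), so the exact numbers depend heavily on the format, but the 4x step from 4K to 8K does not.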
Then we had the cooling solution with cooling from two directions - top and bottom. And in addition, liquid metal cooling from the top. As we now know that the GPU is not overclocked but rather runs at AMD's intended frequencies, one must ask why all this exotic cooling is required. Ultra-silence is all nice, but at that cost?
Then we have the die size of the APU. It is very hard to put more than 16 MB of L2 cache on it before running out of die area in mm² (with reasonable assumptions).
Then we have the need for a large cache for RT, which has significant memory requirements if intersection tests are to run properly (BVH modelling etc.).
Finally we have the teardown video of the PS5. The most edited section of the teardown video is of the APU. Sony has digitally erased all markings on the APU. Furthermore, the APU + socket is thick. Like really thick. One explanation could of course be the bottom cooling plate that connects with the other side of the board, but that thickness does not make sense in my opinion. Compare the socket etc. with the XSX and you will understand what I am talking about.
Conclusion: It seems likely that Sony has a stacked design. The most logical conclusion from that is that they have stacked shared cache memory to save real-estate on the main die and implemented a top and bottom cooling solution to make it happen. This would explain all the items above and be in line with Sony's chosen design paths in the past for several other chips.
I might be 100% wrong of course, but it is a speculation thread after all.
If Sony tells us that the GPU has more than 16MB of shared cache, we will know that the chip is stacked. I see no other way they can fit that memory into the design.