I totally forgot about mesh shading. 3DMark's benchmark suggests insane performance gains, so I wonder if this feature has been implemented in any games already?
It's definitely in the Matrix demo, at least parts of it. But I'm not sure if any commercial games are currently using mesh shaders (or primitive shaders, for that matter). Games are probably still running exclusively on the fixed-function graphics pipeline, since most of them have been cross-gen so far.
AV1 is taking so long. There will be some great new possibilities once it's out.
rtx-30-series-av1-decoding (www.nvidia.com)
"We are working with Twitch on the next generation of game streaming.
AV1 will enable Twitch viewers to watch at up to 1440p 120 FPS at 8mbps" or greatly improved iq but still 60 fps.
Oh, that's gonna be good. This should also hopefully trickle down to lower resolutions and framerates too. (I usually set Twitch streams to a low bitrate unless there are certain moments I'm actually watching more attentively, in which case I might jump the resolution up to the source. Otherwise I treat them like audio podcasts.)
Hopefully Twitch also changes the audio bitrate at lower resolutions; just handle audio and video on two different encode paths, like YouTube does.
Sorry for the noob question, but how does memory bandwidth work? For Nvidia they'll quote a number like 9500 MHz, and for AMD it'll be 2000 MHz. Then they say AMD is 16 Gbps and Nvidia is 19 Gbps, but then they talk about bandwidth being up to 1 TB/s.
Personally I don't even look at the memory controller clocks when it comes to GPUs, just the bus width and the module bandwidth.
If you have, say, a 14 Gbps (gigabits per second) GDDR6 module, like the current-gen systems do, then each data I/O pin on the module can transfer at 14 Gbps, or 1.75 GB (gigabytes) per second (divide any amount in bits by eight to get the amount in bytes; there are eight bits in a byte). Then multiply that by the number of data I/O pins; GDDR6 modules are 32-bit, so they have 32 data I/O pins. That's how you get 56 GB/s of module bandwidth.
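If it helps to see the arithmetic laid out, here's a quick Python sketch of that per-module calculation (just the 14 Gbps / 32-bit example from above, nothing vendor-specific):

```python
# Per-module bandwidth for a 14 Gbps GDDR6 module (the example above).
pin_rate_gbps = 14        # each data I/O pin transfers 14 gigabits per second
data_pins = 32            # GDDR6 modules have a 32-bit data interface

module_bandwidth_gbps = pin_rate_gbps * data_pins   # total gigabits per second
module_bandwidth_gbs = module_bandwidth_gbps / 8    # divide by 8 to get gigabytes per second

print(module_bandwidth_gbs)  # 56.0 GB/s per module
```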
Then look at the bus width; it's also given in bits. The PS5 has a 256-bit GDDR6 memory bus, and since it isn't using clamshell mode, each module runs at its full interface width (32-bit, vs. 16-bit in clamshell configurations). That means you can put eight 32-bit modules on the bus. So multiply the module bandwidth by the bus width divided by the module width (in this case 256/32 = 8) and you get 448 GB/s, the GDDR6 bandwidth of the PS5.
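Same thing for the whole bus, as a sketch (numbers are the PS5 example above: 256-bit bus, 14 Gbps modules, no clamshell):

```python
# Total bandwidth for a 256-bit GDDR6 bus populated with 14 Gbps modules.
bus_width_bits = 256
module_width_bits = 32          # full-width modules, i.e. no clamshell mode
module_bandwidth_gbs = 56       # from the previous step (14 Gbps * 32 pins / 8)

modules_on_bus = bus_width_bits // module_width_bits       # 256 / 32 = 8 modules
total_bandwidth_gbs = module_bandwidth_gbs * modules_on_bus

print(modules_on_bus, total_bandwidth_gbs)  # 8 modules, 448 GB/s
```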
You can use that same method to figure out Series X, Series S, and pretty much any other modern GPU. HBM designs are different: they have a lot more data I/O lanes (128 vs. 32, so they are 128-bit memory devices instead of 32-bit), and they are designed for stacking via TSVs (through-silicon vias), typically in 4-Hi, 8-Hi, 12-Hi and (supposedly, for HBM3) 16-Hi stacks. The stack height tells you how many dies are in the stack; multiply the capacity per die by the number of dies to get the total capacity per stack. You can also multiply the per-die data I/O width by the number of dies in the stack to work out the bus width of the stack.
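As a rough sketch of that per-stack math, following the per-die rule of thumb above (the 8-Hi height and the 2 GB-per-die capacity here are just placeholder numbers to show the method, not a specific HBM product):

```python
# HBM stack capacity and bus width, using the per-die rule of thumb above.
io_bits_per_die = 128       # HBM dies expose 128 data I/O bits each
dies_in_stack = 8           # an "8-Hi" stack (placeholder height)
capacity_per_die_gb = 2     # assumed example capacity per die

stack_bus_width_bits = io_bits_per_die * dies_in_stack    # 1024-bit stack interface
stack_capacity_gb = capacity_per_die_gb * dies_in_stack   # 16 GB for the whole stack

print(stack_bus_width_bits, stack_capacity_gb)
```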
For figuring per-module bandwidth you use the same method as for GDDR memories. DDR system RAM is a bit different; you want to use the module speed, usually expressed in MHz (some also express it in MT/s, or megatransfers per second). For example, DDR4-3200 runs at 3200 MT/s; multiply 3200 by 64 (the number of data I/O bits for DDR memories), then divide that amount by 8 (to convert bits to bytes) and you get 25,600 MB/s, or 25.6 GB/s. Some people split the 3200 in half since the memory clock is technically 1600 MHz and it's doubled due to the way DDR works, but if you already know that you can skip that step.
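And the DDR4-3200 example from above, same idea:

```python
# Bandwidth of a single DDR4-3200 channel.
transfer_rate_mts = 3200    # 3200 MT/s (the memory clock is 1600 MHz, doubled by DDR)
bus_width_bits = 64         # standard DDR data bus width per channel

bandwidth_mbs = transfer_rate_mts * bus_width_bits / 8   # 25,600 MB/s
print(bandwidth_mbs / 1000)                              # ~25.6 GB/s
```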
Since all the parts of this are Apple's, and granted it's ARM-based, can Intel and AMD do the same thing on x86 to knock out NVIDIA? I've got a strong feeling it's headed this way.....
They're already trying to do that xD. Take a look at AMD's MI100 designs; that's an indication of where RDNA 4 and especially RDNA 5 will go design-wise. Intel already has Ponte Vecchio and their own multi-chip designs going.
What I'm more interested in is whether (or more likely when) AMD, Intel and Nvidia move away from GDDR for mainstream GPUs and start using HBM. And I'm especially curious whether any of them design future GPUs around HBM-PIM technologies, because that would probably represent another paradigm shift IMHO (and 10th-gen consoles would benefit a ton from it as well).