AI demands so much RAM that the AI boom skyrocketed the price. I'm not much of a techie, but does AI really demand so much RAM?

You need to store the LLM in VRAM to run the models efficiently.
That is just a troll listing. You can still get 32GB at fairly low prices elsewhere.
This is insane!
No. Sold and shipped by Newegg; I specifically included this in the screenshot.
VRAM is required in enormous amounts. Even when you use a quantized version of an AI model (quantization reduces its precision: for example, half the size at the cost of 5 to 15% of the quality; models can be quantized even more aggressively, but the smaller the AI is, the more it suffers from quantization, all the way down to becoming practically lobotomized), it is still ridiculous. Say, for example, you have a relatively dumb local AI model with 24 billion parameters. It can run on a single RTX 3090 (24GB VRAM) at a 32k context length (roughly 20k-25k words of text). Some other model, let's say with 32 billion parameters, requires TWO 3090s to run at the same 32k context length (especially if the model is more complex and has more 'layers').
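If you want a feel for where those numbers come from, here's a back-of-the-envelope sketch (my own rough arithmetic, not a benchmark): weight memory is roughly parameter count times bits per parameter, and this ignores the KV cache and runtime overhead that also eat into the 24GB.

```python
# Rough estimate of how much memory just the model weights need at
# different quantization levels. Ignores KV cache, activations and
# runtime overhead, so real usage is a few GB higher.

def weight_memory_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

for params in (24, 32):
    for label, bits in (("FP16", 16), ("8-bit", 8), ("4-bit", 4)):
        print(f"{params}B at {label}: ~{weight_memory_gb(params, bits):.0f} GB of weights")
```

At 4-bit that's roughly 12 GB of weights for the 24B model and 16 GB for the 32B one, which is why the former fits on one 24GB card with room left for context and the latter gets tight.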
LLM? Hmm.
Large Language Model. Essentially you need to store its whole set of weights (billions to trillions of parameters) in memory.
Very interesting what you wrote. I will research more about AI later. Got to sleep now and back to work later. Thanks.
Now, things like ChatGPT have trillions of parameters, and they probably run at full precision (no quantization). Just imagine how much server memory they gobble up, all things considered: millions of users around the world making requests at the same time, and such giants also have huge context windows (going from 32k to 1M increases VRAM consumption shockingly fast).
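To see why the context window is the scary part: the KV cache (the memory that grows with every token of context) scales linearly with context length. A hedged sketch below; the architecture numbers are made-up but plausible for a mid-size dense transformer, so treat the output as an illustration, not a measurement of any real model.

```python
# Sketch: KV cache size vs context length. Every token in the context keeps
# a key vector and a value vector per layer, so the cache grows linearly.
# The architecture numbers are assumptions, not any specific model.

def kv_cache_gb(context_tokens: int,
                n_layers: int = 64,
                n_kv_heads: int = 8,        # grouped-query attention
                head_dim: int = 128,
                bytes_per_value: int = 2) -> float:   # FP16
    per_token_bytes = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value  # keys + values
    return context_tokens * per_token_bytes / 1e9

for ctx in (32_000, 128_000, 1_000_000):
    print(f"{ctx:>9,} tokens of context -> ~{kv_cache_gb(ctx):.0f} GB of KV cache")
```

And that's per conversation; multiply it by however many requests are in flight at once (serving stacks batch and share a lot, but the direction is the same).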
As for actual RAM (not VRAM), there are models (mixture-of-experts ones) that have "total" parameters and "active" parameters: say, 100B total, 20B active. The total set can sit in RAM while the active part (together with the context) goes into VRAM, and that gives a rather slow (compared to full VRAM) but nonetheless usable experience.
Using ONLY RAM is never a good way, unless it's some kind of unified memory (think Apple's Macs with the M1-M5 chips).
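A quick sketch of that total-vs-active split, using the hypothetical 100B/20B numbers from above (not any specific real model) and assuming 4-bit quantization:

```python
# Mixture-of-experts idea in one calculation: all expert weights have to
# live somewhere (system RAM), but only the "active" slice is touched for
# any given token. 100B/20B and 4-bit are the hypothetical numbers above.

def gb(params_billion: float, bits: float) -> float:
    return params_billion * 1e9 * bits / 8 / 1e9

total_b, active_b, bits = 100, 20, 4

print(f"All weights (system RAM):  ~{gb(total_b, bits):.0f} GB")
print(f"Active slice per token:    ~{gb(active_b, bits):.0f} GB")
```

The catch is that which experts are active changes from token to token, so weights keep getting pulled across the PCIe bus; that's why it's usable but noticeably slower than a model that fits entirely in VRAM, and why unified memory sidesteps the problem.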
Point is, it's only going to get worse, since those assholes developing AI can't find a way to make their models more efficient. Everything they release now aims to 'know everything', and they feed more and more training data into their AI. All the enthusiasts who want to use AI locally beg the developers to give them smaller, faster models. That's also where all the fears about "AI everywhere (like in Windows)" come from: people know that only gigantic, huge AI is somewhat smart, and you can't put such a thing in every PC around the world, not in 2025 at least. So if Microsoft ends up shoving some kind of agentic garbage down people's throats, it's going to be error-prone, unreliable trash that causes more harm than good. Probably smarter than local 24B-32B models, but still shit, since there's just not enough hardware and electricity to saturate the whole world with 'good' AI.
Good to know this. Makes sense, cuz from what I've learned RAM + CPU is way, way inferior to VRAM + GPU.
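Rough numbers on why (assuming generation is memory-bandwidth bound, i.e. each new token has to read roughly all the active weights from memory; the bandwidth figures are approximate spec-sheet values and this is an upper bound that ignores compute and overhead):

```python
# Upper-bound tokens/sec if the only limit were memory bandwidth:
# every generated token reads ~all active weights once.

weights_gb = 12.0  # e.g. a 24B model quantized to 4-bit

bandwidth_gb_per_s = {
    "RTX 3090 VRAM (GDDR6X)": 936,   # ~936 GB/s spec bandwidth
    "dual-channel DDR5 RAM": 90,     # ~80-90 GB/s, platform dependent
}

for name, bw in bandwidth_gb_per_s.items():
    print(f"{name}: at most ~{bw / weights_gb:.0f} tokens/sec")
```

Roughly an order of magnitude apart, before you even get to how much slower the CPU is at the matrix math itself.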
Your obsession is becoming unhealthy.
If you think these ram prices won't affect PS6 or PSN prices, I don't know what to tell you.
They will, but clearly it's not an imminent issue. If Sony thought it was going to be a problem for the PS5 very soon, they wouldn't currently be running aggressive discounts. They'd be trying to keep inventory so they don't have to manufacture as many PS5s at ludicrous prices.
They probably will next year around springtime, but in the meantime they managed to stockpile some RAM before the craziness, enough to last them months.
Uh... This price hike is until next month or so, right?

As I was saying in the other thread, if you're thinking of building or upgrading soon, you need to accelerate your timeframe and do it now.
Right?!
I guess I will be riding my 32GB / 4070S rig until the damn wheels fall off.

You and me both, man. It's not like modern gaming is giving me a lot of reasons to upgrade either.