
Little Torn On AI

Darkmakaimura

I'm really having fun messing with Gemini and I love messing with my photos. Having me and my former roommate appearing as Cthulhu investigators is just so cool and much fun.

But I also use it to ask questions about video games and one thing I noticed....

It's constantly getting things wrong.

I mean a lot of the information is just incorrect, or outright speculation at best. I know they call this hallucinating. But Gemini seems to be on some serious shrooms because it sure hallucinates a lot.

For as much praise as this gets, I just don't see it being accurate much at all. Maybe I don't understand what's going on but it looks like this has a very long way to go. I don't see how many companies could just rely on this right now.
 
I was using Gemini too but it's got worse recently. Seems to be a common issue. I just moved over to Claude which is a lot better although it can't do image/video/music generation but I don't really want to do that. The usage limit is more restrictive too.

Not sure how ChatGPT is these days; I don't use it anymore.
 
The more popular it gets, the more it's going to hallucinate. It's cheaper to guess than to be 100% accurate.

Prompting is going to be a skill people are going to have to learn, because the hallucinations are gonna become more commonplace, especially with the free tiers.
 
And they want those LLMs to replace people in work...
 
If it's something for work or school, don't just run your query through the LLM. Also Google it, or go to a more reliable/authoritative source if you know where that is.
 
I love AI. It's helped me a lot, and I'm using ChatGPT Plus... and I generate images. It's like my assistant. I ask it anything, and I use it for work; it's gotten me out of a tight spot. I love it because I use it well, and it's helped me with finances.
 
Yeah, stuff like ChatGPT can be useful, but you need to be very aware that it can and often will give you false information with full confidence.

Before starting KCD2 I thought it would be interesting to ask ChatGPT some questions about plot elements I didn't fully remember from the first game.
And the answers it gave me were completely wrong. It knew all the characters, locations and overall context from the game, but it basically made up random scenarios with them.
 
How do they determine a hallucination? If you ask it to write you a draft letter on something and it gets 12 out of 15 things right, does that mean a 20% hallucination rate, or is it just judged from an overall perspective?
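For what it's worth, the arithmetic in that question can be checked with a few lines of Python. This is only a sketch that assumes "rate" means the share of items that were wrong; how vendors or benchmarks actually score hallucinations isn't stated anywhere in this thread, and `error_rate` is a made-up helper name.

```python
def error_rate(correct: int, total: int) -> float:
    """Share of items that were wrong, as a fraction between 0 and 1."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")
    return (total - correct) / total


# 12 right out of 15 means 3 wrong out of 15.
rate = error_rate(correct=12, total=15)
print(f"{rate:.0%}")
```

So 12 out of 15 right does work out to a 20% miss rate under that simple definition, though a single letter is a tiny sample.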
 
My job had me compare multiple ChatGPT vs. Google AI responses for about a week this January. I had to fact-check them, and seeing how seriously and seemingly randomly wrong both were on things that seemed easy really turned me off the technology, among other things. I still use it for some stuff, but much less after that.
 
It's useful but be wary of using it to make decisions that can have significant impact. Like if I'm troubleshooting how to get Bloodborne on PC running, it's great.

Not the same if you're asking it whether you're being reasonable to think that the neighbors who moved in next door are government agents sent to spy on you, and it goes along with it because AI is chill like that.
 
Yeah, if it's hallucinating it can be stopped via the correct prompt. Positive guardrails work better than negative ones too, so I tend to tell it to "use only real information" rather than "don't make stuff up". I tend to come up with my idea, then stick it through an AI to format the prompt into the GCSE format, and it works pretty well.

AI by nature is sycophantic and tries to help us, and sometimes it does that by making stuff up.
 
Coding-wise I think AI is still fine, but it seems like content generation such as audio and video has gotten worse in all the subscription-based general services. Seems like specialized ones - like Suno or Kling etc - have become better than generic ones like Gemini or OpenAI. Odd but still.
 
They are just a glorified input/output machine. The problem is it will make stuff up if it doesn't know, and the average joe is too dumb to fact-check. Just slop trained on slop and brainrot spitting out more slop.

Remember Google's racist AI?
 
Of course the general subscription services have gotten worse, and even if they haven't gotten worse right now, they will eventually. These companies are burning billions and they're not profitable running these services. They were/are subsidized by private investors, but when the well eventually runs dry they need to start actually making money. There are two ways to do that: you make the service worse (enshittify it) to reduce cost, or you raise the price. Raising the price will be a hard sell for people who are already used to free or heavily subsidized prices, so they'll go the enshittify route first.
 
I've been using it to help with Skyrim questions and it's often very wrong. The really shitty thing is that it is so confident that you assume at the beginning that it's accurate, but it falls apart when you get even a little specific or complex. What's worse is when you respond with "it's nowhere near there, it was over there," and it responds "thanks for calling me out on that" or "oh that's right, it was over there". It would be helpful if it were to just say "I'm not sure exactly" or "it could be here or over there". I've gone back to just reading Reddit comments.
 
The novelty of both image and video gen wore off on me long ago. Now it just annoys me when I see people posting their "meme" that is just some super uncreative thing thrown in a prompt. Like, just post the actual joke that isn't funny and I can scroll past it fast, but a giant lame image sucks lol

For Google Search it's awesome 75% of the time but when it sucks it sucks. The "I don't actually know the answer but I don't really KNOW I don't know so I'm going to ramble about nonsense" responses just suck, waste your time, and then the regular search results are less useful.

For work I do development, almost entirely backend thankfully, so Gen AI tools aren't really used for "vibe coding", but I have friends I work with at my consulting agency who say it's gotten really bad at clients for full-stack stuff. TPMs are always vibe coding some nonsense that only half works, has nonsensical features, and doesn't follow the coding standards, then handing it to the dev teams as if it's useful, or as a way to stress them out and suggest they should be able to get stuff done faster.
 
It's complete fucking trash from top to bottom.

And it's getting worse. Answers worse. Follow-on knowledge worse. It's learning from its own mistakes as it fills more and more of the internet with its own output.

Funny that the thing that will kill AI.....is AI
 
Someone called it an idiot savant early on and I still think it holds true today. It can do things fast and in great detail without being weighed down by cognitive overhead, but you have to hold its hand and guide it, otherwise it'll hallucinate, over-engineer handling edge cases that bloat up the codebase, and re-create functions that already exist elsewhere. But I know it'll get better over time, there's already been a lot of improvement in the last several years.
 
I recently started to play with it

ChatGPT has this incredibly annoying habit of ending nearly every response with a question or suggestion

It's frustrating me more than it should. Even when I tell it to stop, it keeps doing it every time.
 
That's an engagement tactic on the system side and is something (almost) completely non-existent with on-prem systems/agents. They use it to keep you in the session using credits.
 
I think it's pretty stupid, but I get it. Still stupid. Even though I'm using the free version

Today I said yes to every suggestion after showing it the email I wanted to send. The results were laughable
 
I use Abacus AI. It's fantastic: it gives me access to all the AI models available, or it routes to whichever gives the best result. Eventually we will be so saturated with AI, and all the nonsense people put into it, that we will be back at the mess Google search is.
 
Some neat stuff has come out that I didn't think would be possible, especially at decent speeds.
IRIS - Irresponsible Rust IRIX Simulator. An SGI Indy emulator, vibed into existence with Rust and AI assistance. Boots IRIX 6.5 and 5.3. Has networking. Has a framebuffer.

 
I'm really having fun messing with Gemini and I love messing with my photos. Having me and my former roommate appearing as Cthulhu investigators is just so cool and much fun.

Honestly, I would never send any personal information to Gemini, it's a privacy nightmare.

I pretty much burn my tokens by asking stuff like 'What would happen if supergirl became a crack addict' just to test the creativity of the model.

Prompting is going to be a skill people are going to have to learn because the hallucinations are gonna become more common place, especially with the free tiers

The problem is that Generative AI is a probabilistic word calculator, it doesn't matter what you type - it can always fuck up.
 
Yeah pretty much all the models are like this. The models don't actually "know" anything. It's fine for generating text that reads like English, but if you go any deeper it will usually start to fall apart.
 
I wonder how long it'll be before skilled professionals in various fields disappear thanks to AI undercutting them. People are currently prepared to put up with inaccuracies and questions about quality as long as there's a cost saving.

Presumably once nobody has the skills anymore, the AI pricing goes up a lot. It'll be the ultimate enshittification. Higher costs and reduced quality, with no alternative.

But, hey, we got some great pics of us as action figures along the way.
 
So I am pretty new to this. I have avoided it for a long time, but the past month and change I've finally started integrating it into my workflow as an Unreal developer. Mostly ChatGPT, a little Grok - I am trying not to get overwhelmed and be one of those people who has like 5 or 6 of them going at once, just taking it a little slow and kicking the tires.
Overall, I have been finding it extremely useful. The guy further up the thread here who described it as an "idiot savant" is spot-on. It feels exactly like that - you have this incredible search engine, and you can explain things to it conversationally, which for my needs is exquisitely powerful. As for the wrong answers/hallucinations, I am starting to understand that as well, and as much as "we are training them," we also need to train ourselves how to use this thing properly: even though it is not a person, you have to get over the inclination to think that it is like one. It's this massive collection of data and knowledge that you can learn to utilize, but there are some real caveats.

Again, for my use-case, it's been very helpful. I'm not a seriously advanced coder (far from it) but I am pretty good with logic and building some fairly intertwined systems. But my know-how only goes so deep. Once I started learning how to express things (again, often conversationally) to it, I found that I could get really helpful results. Being able to share screenshots that it can (magically!) understand is so mind-blowing to me. I really feel like I have "the expert guy sitting next to me at work" that I can just elbow relentlessly all day and ask for help with all sorts of weird special-case fixes that I've gotten myself into.

It's not perfect. It's helped me out of some big jams, but it's also wasted some time in some other cases. Like I said, I am still learning how to talk to it. But I do really feel like I have seen enough with my own eyes, in the short time I have been utilizing it, that it is here to stay (at least as an integral part of my own pipeline).

Also.. DAMN it can be funny, that was something I did not expect. I have a pretty dark/weird sense of humor and I find that we "get along pretty well." I like having that element in there when I am trying to work through something complex, it helps keep that feeling of "you are talking to something you can relate to" even if it is just an illusion.. or.. something. Also, very interesting to discuss all kinds of philosophy with. You take it at face-value, but it can be very engaging. It has access to a lot of things.

It'll be interesting to see where everything settles a "few generations in" with all of this. Also I am still wary. It's a bit freaky.
 
Once more and more people and companies realize how dumb LLMs are, this whole AI Ponzi scheme will start to fall apart.

They are promising AGI-like features but their product is not even half of that...

 
I agree with everything the Unreal developer above said. It can take a while to learn how to get the best responses from it, and I have pretty much come to the conclusion that hallucinations and odd answers are down to prompting. It has taken me a while to get there, but I am lucky that my work sets time aside to build AI tools.

A lot of my work is now agentic AI written over the last few months. As an example, I can record a Teams meeting, get it to summarise the actions, put them into quotation documents and upload them into 4 different systems, emailing the correct people automatically. If there's anything missing, like prices or timescales, it will let me know, and it doesn't do anything without me reviewing it. That saves me days of time. One thing I have found is how much different LLMs differ based on the use-case: Claude is amazing for code, but was pretty bad in this agent, as an example.

As you say, it takes a while to get something that works for different jobs, but the promise is definitely there. What's amazing is how much better the agents I wrote a few months ago have become due to LLM updates.
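
The "flag anything missing and don't act without review" step described above could look roughly like this in Python. It is only a sketch: the `Quote` class, the `price` and `timescale` fields and the `approved` flag are made up for illustration, since the post doesn't say how the real agent is built.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Quote:
    # Hypothetical fields; the real agent's data model is not described.
    customer: str
    price: Optional[float] = None
    timescale: Optional[str] = None


def missing_fields(quote: Quote) -> list[str]:
    """Return the names of required fields that are still empty."""
    required = {"price": quote.price, "timescale": quote.timescale}
    return [name for name, value in required.items() if value is None]


def ready_to_send(quote: Quote, approved: bool) -> bool:
    """Only act when nothing is missing AND a human has signed off."""
    return not missing_fields(quote) and approved


draft = Quote(customer="Example Ltd", price=1200.0)
print(missing_fields(draft))                # the timescale was never filled in
print(ready_to_send(draft, approved=True))  # blocked while a field is missing
```

The point of keeping the check separate from the LLM is that the gate itself is plain code, so it can't be talked into skipping the review.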
 
The hallucinations are pretty bad.

I've been using it for academic use and the number of references all LLMs have just made up is pretty crazy. But you should be checking all sources anyway and they have gotten better.

Most of the referencing is just performative (no, you did not fully read all 20 300-page books for your paper), so I don't feel any shame in it.

For the 'writing' part, that too is fine. Why bother spending hours typing up when you can get an LLM to do it in seconds and then just spend some time editing it yourself?
 
Claims that Grok beats the others on this dimension. Caveat emptor..



The best part is the woke lot don't (openly) use it. Maybe that helps a bit?

The looks I get when I say I use Grok are hilarious. I don't know how some people can get so worked up by a tool.
 