Wrong. No matter how many experts keep proving you wrong, you people just keep up with the delusions. They are only intelligence if you consider by-hearted knowledge intelligence. They have absolutely no reasoning capabilities. They are merely a glorified collection of look up tables.
First of all, I'm in the field, and I've been working with language models since
before the transformer architecture even appeared. And I've worked with every part of the newer generations of models, from training & fine-tuning to real applications in software. I can explain the entire stack of a model like GPT-4 from scratch on a whiteboard, starting from the low-level pieces, and I've done that many times for demonstrations.
You have an extremely weird sense of what the "experts" believe, based on cherry-picking and misinterpreting a few of the louder voices with their own well-known agendas. For instance:
Even yann lecun, the chief ai engineer of meta has stressed this again and again. The only ones still driving the delusion are the grifters at open ai who have a vested interest in hyping up their mediocre stuff
Yeah, we all know what Yann LeCun wants, and he makes exaggerated arguments to further it. He's obsessed with trying to achieve a more human-like AI stack in the direction of the mythical AGI, and he isn't impressed by anything he considers a halfway measure. Okay.
Even the far more consequential figure and primary colleague of LeCun during his heyday -- Geoffrey Hinton, who has produced
much more significant work across all areas of machine learning over the years, and who notably was the other researcher on the project that brought LeCun fame in the first place -- openly disagrees with this and repeatedly says that LLMs are doing reasoning.
LeCun is the kind of guy who says controversial things to get into the normie press on sites like Wired and grab the attention of people who aren't in the field and aren't experts. Hinton is the guy who wrote half the seminal papers on actual machine learning innovations over the years, and the guy you'll end up citing repeatedly in your footnotes if you do real work.
But you wouldn't know this because you're clearly not someone who is in the field or who has ever trained a model.
They are merely a glorified collection of look up tables.
Back to this statement: totally false.
A lookup table cannot synthesize its knowledge intelligently. An LLM
does, and in rather incredible fashion.
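To make the contrast concrete, here's a toy sketch (purely illustrative: random weights, made-up vocabulary, nothing trained) of the difference between an exact-match table and a parametric model. The table can only return what was stored under an exact key; the model pushes any input through learned weights and always yields an output distribution, even for phrasings it never saw verbatim:

```python
import numpy as np

# Lookup table: exact key or nothing.
table = {"capital of france": "paris", "capital of japan": "tokyo"}
print(table.get("what is the capital city of france"))   # -> None: no exact key, no answer

# Toy parametric model: words -> vectors -> output distribution.
vocab = {"what": 0, "is": 1, "the": 2, "capital": 3, "city": 4, "of": 5, "france": 6}
rng = np.random.default_rng(0)
emb = rng.normal(size=(len(vocab), 8))   # "learned" embeddings (random here)
W = rng.normal(size=(8, 2))              # "learned" output weights (random here)

def answer_distribution(words):
    x = emb[[vocab[w] for w in words]].mean(axis=0)   # pool the input tokens
    logits = x @ W
    return np.exp(logits) / np.exp(logits).sum()      # softmax over two candidate answers

# Untrained, so the numbers are arbitrary -- but a never-seen phrasing still flows
# through the weights and produces a usable output instead of a key miss.
print(answer_distribution(["what", "is", "the", "capital", "city", "of", "france"]))
```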
Just to clarify a little, since you don't seem to know how next-token prediction actually works: a better way to understand the model as it moves from word to word is in terms of the "momentum" of the text. The attention mechanisms create something like a flow of information across the tokens/words, so that at any given moment, the "direction" of where the text is going is already deeply contained within that momentum. When an LLM unrolls its next paragraph, its response to a question, etc., it may emit it one token at a time (*and even that is a simplification, since decoding strategies like beam search keep multiple candidate continuations in play at once),
but it is unrolling a momentum and a general destination (e.g. where this argument/answer is heading) that is already contained in its forward pass over the prior tokens.
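Here's a minimal sketch of what "one token at a time" actually looks like -- toy dimensions, untrained weights, and greedy decoding for simplicity rather than beam search or sampling. The point is that every step re-reads the entire prefix through causal attention, so each new token is driven by the whole context, not produced in isolation:

```python
import torch
import torch.nn as nn

vocab_size, d_model, n_heads, n_layers = 100, 64, 4, 2

embed = nn.Embedding(vocab_size, d_model)
layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
body = nn.TransformerEncoder(layer, n_layers)
head = nn.Linear(d_model, vocab_size)

@torch.no_grad()
def next_token_logits(token_ids):
    T = token_ids.size(1)
    causal = nn.Transformer.generate_square_subsequent_mask(T)  # no attending to future tokens
    h = body(embed(token_ids), mask=causal)   # position t attends to every position <= t
    return head(h[:, -1])                     # predict from the last position, which has
                                              # absorbed the entire prefix via attention

tokens = torch.randint(0, vocab_size, (1, 5))  # random ids stand in for a prompt
for _ in range(10):                            # "one token at a time"...
    nxt = next_token_logits(tokens).argmax(dim=-1, keepdim=True)
    tokens = torch.cat([tokens, nxt], dim=1)   # ...but each step is conditioned on everything so far
print(tokens)
```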
A good comparison: these models (transformers) were originally developed for the task of language translation. So they had two halves: one network that deeply
embeds the source sentence into a dense meaning space, and a second network that
unrolls (generates, one token at a time) the same sentence or passage in the target language. If you understand that stack, you'll understand that the deeply embedded content passed into the decoder side of the network already contains the full "momentum" of the meaning that is to be unrolled in the target language. LLMs like GPT are in many ways similar, so when one answers your question, it is 100% false to say that the network only "knows" the next token. No, it has a strongly directional momentum and meaning already driving it toward a goal, which is why it can compose those next tokens in a way that faithfully unrolls its complex response.
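A sketch of that two-half translation stack, under the same assumptions as above (toy sizes, untrained weights, greedy decoding, positional encodings omitted for brevity). Notice that the encoder's dense memory is computed before the first target token is ever emitted, and every decoding step cross-attends to it:

```python
import torch
import torch.nn as nn

src_vocab, tgt_vocab, d_model = 1000, 1000, 64

src_embed = nn.Embedding(src_vocab, d_model)
tgt_embed = nn.Embedding(tgt_vocab, d_model)
model = nn.Transformer(d_model=d_model, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2,
                       batch_first=True)
out_head = nn.Linear(d_model, tgt_vocab)

src = torch.randint(0, src_vocab, (1, 12))      # a "source sentence" of 12 token ids
with torch.no_grad():
    # The encoder compresses the WHOLE source into a dense memory up front --
    # this is the "momentum" that every later decoding step gets to consult.
    memory = model.encoder(src_embed(src))

    tgt = torch.zeros(1, 1, dtype=torch.long)   # start-of-sequence token (id 0 here)
    for _ in range(8):                          # unroll the target one token at a time
        causal = nn.Transformer.generate_square_subsequent_mask(tgt.size(1))
        h = model.decoder(tgt_embed(tgt), memory, tgt_mask=causal)  # cross-attends to memory
        nxt = out_head(h[:, -1]).argmax(dim=-1, keepdim=True)
        tgt = torch.cat([tgt, nxt], dim=1)
print(tgt)
```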
And again, across many kinds of extremely difficult tasks that require
much more than recall or lookup, these models already outperform most humans.
edit: and one more comment, since you like to link "planning"-related papers. That narrow emphasis on agent-like planning is of a piece with LeCun's mistake, which is to understand intelligence only in terms of the way humans consciously think. That's not how LLMs or other AI models think; they have a radically different paradigm, and it is totally devoid of "self," center, agency, etc. Yet the way they reason is actually far more powerful--for many kinds of tasks and conceptual problems--than the limited human conscious frame of plan and intention.
For instance, an extremely difficult kind of reasoning task is to read a complex short story (complete with authorial tone, irony, etc.) and intelligently respond to complex questions about the characters, their intentions, the author's intentions, the subtleties of the interactions, the hidden metaphors, etc. Do you think you can beat GPT-4 at this? Be careful, because you very well might not. At this kind of complex comprehension--even involving subtle nuances of character, genre, metaphor, and tone--the latest LLMs can easily beat most humans. If they don't already beat you personally at it, they almost certainly will very soon.