Again, clearly you haven't studied anything in the philosophy of language or linguistics. Effective use of natural languages (and by effective I mean: GPT-4 scores far better on writing coherence and sophistication in many languages than average native speakers do) is not based on grammatical or syntactic rules; those only get you in the door.
If you know anything about translation, you'll know that high-level translation is not a systematic game of rules at all; it's a matter of understanding, and possibly the greatest test of true understanding. Read the major authors in hermeneutics, like Gadamer or Ricoeur, to see how well established this is in the philosophy of language.
Being able to move ideas between wildly different languages, and (as I stated above and you ignored) to translate between modes of expression, for instance turning a complex academic argument into an intuitive explanation built on simple metaphors for a layperson, requires understanding the content itself. It requires grasping the tone, the hidden implications, and the subtexts that inhabit different languages.
GPT doesn't translate in a rule-based way; it is foundationally multilingual in the sense that it can accomplish any thought or task natively in countless languages, which requires abstracting meaning away from the form of any one language.
This is another extremely polemical paper from someone writing in essay fashion; the tone is fun, but the argument is not very interesting. I also know his type well: he adheres to the roughly American "analytic" school of thought on language, as is evident from the extremely narrow confines of what he counts as reason. Fortunately, few take that tradition very seriously anymore on its own narrow terms.
(I mentioned Wittgenstein earlier for the same reason. Despite starting in the analytic tradition and being a star of it, his final works made a radical turn: he came to grasp that language is neither a correspondence with the world nor a logical form or structure laid on top of perception or on top of some substrate of logical reasoning. His re-conception of language as "language games" blows apart that kind of thinking and puts you in the territory of the hermeneutic authors mentioned earlier.)
So how do you make GPT-4 reason well? First you need to understand what reasoning actually is: a game in language. It is not a mental process that is later expressed in language; it is a particular kind of language use.
How do you teach a student to reason? One of the best and oldest ways is the Socratic method: students learn to use the process of question and analogy to refine their thinking, expose contradictions, and reach a new insight or refute a point. They learn, through language, to work out the truth. GPT models can similarly do excellent reasoning if you give them the form of a Socratic dialogue (e.g. https://arxiv.org/pdf/2310.00074) or similar variations on chain-of-thought prompting, so that they walk through a solution piece by piece.
The key thing to understand is that an LLM has no internal monologue. When it works out any answer, even a trivial one, it must do so out loud, through the medium of language in the dialogue. Those who understand this can prompt these models to reason out loud through extremely complex and nuanced problems and reach real conclusions and insights.
By contrast, those who demand an immediate answer from the model with no intermediate steps, and then use the resulting failure as a gotcha, have so poorly understood both the models and human reasoning that they have nothing to contribute to the debate.
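To make the contrast concrete, here is a minimal sketch of the two prompting styles. The prompt wording and the message format are my own illustration (patterned after common chat-completion message lists, not any particular vendor's API); no model is actually called here:

```python
# Two ways of prompting an LLM on the same question: demanding a bare
# answer versus asking it to reason out loud in the dialogue itself.
# Message format and wording are illustrative assumptions.

def direct_prompt(question: str) -> list[dict]:
    """Gotcha-style: demands an immediate answer, leaving the model
    no room to work anything out 'out loud'."""
    return [
        {"role": "user",
         "content": f"{question}\nAnswer with only the final result."},
    ]

def reasoning_prompt(question: str) -> list[dict]:
    """Chain-of-thought style: the working-out happens in language,
    step by step, before any final answer is stated."""
    return [
        {"role": "system",
         "content": ("Work through the problem step by step, out loud, "
                     "questioning each intermediate claim before moving on. "
                     "State the final answer only after the full derivation.")},
        {"role": "user", "content": question},
    ]

# Example: a question famous for tripping up immediate answers.
q = ("A bat and a ball cost $1.10 in total; the bat costs $1.00 more "
     "than the ball. How much does the ball cost?")
for message in reasoning_prompt(q):
    print(message["role"], "->", message["content"])
```

The point is not the specific wording but the structure: the second form gives the model a place in the dialogue to do its reasoning, which is the only place it can do it.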
In any case, I have already given concrete examples above of GPT-4 reasoning with me in full transcript form, across highly technical conversations in both machine learning theory and mathematics, both obtained quickly by simply having a conversation with it while writing to you here. In both cases, if you read the conversations carefully to the end, I asked it to apply its knowledge to novel cases. Such examples are readily available everywhere.
If you are capable of basic reasoning, then you must see that the burden is on you to show that it never reasons at a high level, not merely to point at failures; even highly intelligent humans make constant low-level mistakes, so mistakes have no relevance to the debate at all. Your best argument amounts to "sometimes it fails to reason correctly in these cases," and that is not a refutation of its ability to reason, especially since every one of those "failures" can trivially be turned into a success when the model is prompted to walk through its chain of thought out loud.
GPT-4 has even notably passed all parts of the bar exam (https://royalsocietypublishing.org/doi/pdf/10.1098/rsta.2023.0254), one of the foremost tests of applied reasoning, without having seen the particular questions, which are renewed each year precisely to force candidates to think through novel situations.