Monday, January 26, 2026

Another Bad Critique of AI, in the New York Times

Want to get your uninformed opinion about AI published in the most prestigious newspaper in the United States? Then follow my advice.

First, it helps if you have little or no perceptible training that would give you the necessary background to critique AI. For example, perhaps you majored in history, instead of relevant fields like computer science, or philosophy, or neuroscience. Or maybe you went on to study something even more unrelated; say, a master's degree in business administration. Why should you know anything about the subject you're critiquing? The New York Times certainly won't care, and neither will your readers.

Second, have a vague thesis, perhaps something like "humans have capabilities machines will never have." Be sure to support that thesis with unbacked assertions, confidently proclaiming things like "judgment cannot be automated" and "Judgment ... is a uniquely human skill". It is important not to specify how you can know claims like this with such certainty. Do not refer to any published studies or journal articles backing up your claims. Uniquely human? Don't refer to the vast ethological literature, where examples of judgment in non-human animals have been documented again and again. And above all, do not point to the long list of prior claims about abilities machines would never, ever have, claims that have been proved wrong over and over again!

Third, you should relate some anecdotes. It's important to rely on anecdotes, rather than any deep investigation or hard data, because people love stories, especially stories that reinforce their prejudices. But it's also important to keep those anecdotes as hazy as possible, so that no one can check the details. For example, just talk vaguely about the failures of an "A.I.-assisted analysis" or "an A.I. model" without specifying precisely what software was used and when. After all, there hasn't been very much progress in AI lately. Two anecdotes should be more than enough.

Be sure not to explore any alternative explanations of why the unnamed "A.I. model" failed to give useful advice in your particular situation; that would be too scientific. For example, if it was because "The chief executives of the two companies had a fraught relationship", be sure not to specify whether this crucial fact was included in the training data. This will allow you to triumphantly conclude that "The model could not account for the interpersonal dynamics".

Finally, if your argument consists mostly of things like (in my words) "here are a couple of times when AI gave the wrong answer to my business question", certainly do not mention that humans give wrong answers to business questions all the time. It is vital not to refer to any data-based comparison of the capabilities of humans and machines, because that might cast doubt on your thesis.

If you follow my advice, millions of people will read your timeless prose. And afterwards, you can get back to the vital work of investment banking.

Sunday, July 13, 2025

Yet Another Bad Analysis of AI

In the past I've commented on bad discussions of thinking, intelligence, brains, and computers, such as those by Gary N. Smith, Doug Hofstadter, Arthur Gardner, and Robert Epstein.

With the rise of LLM's like ChatGPT, the commentary has only gotten worse. We see the same kinds of mistakes and bad argumentation tactics that have been used for decades. Here are some of them.

  1. Commenters will always say things like "It looks like an LLM is thinking, but it's not really thinking." But they'll never explain what the difference is between "thinking" and "really thinking".
  2. Commenters will always say things like "It's not really thinking, it's just X", where X is some mechanical or physical process. This is a mistake that goes back to Leibniz, and just amounts to a categorical denial that atomic computational processes can give rise to minds.
  3. Whenever a new advance in AI comes along and falsifies a prediction along the lines of "computers, lacking true intelligence, will never do X", commenters won't admit that the machines do have "true intelligence" after all. They'll just say, "Oh, I guess I was wrong; doing X doesn't require true intelligence."
  4. Commenters will use vague words like "understand" and "think", but never provide any definitive tests by which we could determine whether a computational system does these actions.
  5. Commenters will always compare AI to only the most competent humans, never to the average person. For example, when ChatGPT matches or slightly exceeds average human performance on some benchmark, this will not be regarded as evidence of thought. "Their results are not always trustworthy!" people will say of LLM's, ignoring the fact that neither are people, encyclopedias, or any other source of information we depend on.
  6. Commenters will claim that some new AI implementation lacks some quality of humans. Maybe it's that the AI is not "embodied", or "lacks feelings", or "doesn't have beliefs", or "lacks ground truth". But they won't provide any clean, direct argument that these things are needed to think or to be intelligent. It's the old "airplanes aren't really flying because they don't flap their wings like birds" fallacy. This is where a background in the theory of computation helps. One of the first things you learn is that one computational model can simulate another; gears, transistors, and neurons are not fundamental aspects of computation (see the sketch just after this list).
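
To make that last point concrete, here is a minimal sketch of my own (not something any of the commenters wrote): a few lines of Python, one model of computation, simulating a tiny Turing machine, another model, that flips the bits of its input. Nothing in the simulation cares whether the underlying hardware is built from gears, transistors, or neurons.

def run_tm(tape, rules, state="start", pos=0):
    # Run a one-tape Turing machine until it enters the "halt" state.
    tape = list(tape)
    while state != "halt":
        symbol = tape[pos] if pos < len(tape) else "_"
        state, write, move = rules[(state, symbol)]
        if pos < len(tape):
            tape[pos] = write
        else:
            tape.append(write)
        pos += 1 if move == "R" else -1
    return "".join(tape)

# A machine that inverts a binary string, halting when it reads a blank.
flip = {("start", "0"): ("start", "1", "R"),
        ("start", "1"): ("start", "0", "R"),
        ("start", "_"): ("halt", "_", "R")}

print(run_tm("1011", flip))   # prints 0100_
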
Here is yet another example. There is a new online course by Carl T. Bergstrom and Jevin D. West, professors at the University of Washington. These are distinguished scholars, and, as you might expect, their analysis is not as bad as some of the others I've cited above. But it is noteworthy that neither of these two professors has a background in neuroscience, the theory of computation, or machine learning.

This is not to say they are entirely wrong. Some of the things they say make sense. They talk about the limitations of some current LLM's, and they warn about the dangers of relying on current LLM's without checking their results independently. They caution that putting LLM's in charge of critical systems is a danger. They argue that use of LLM's by students may result in decreased educational gain. I agree with all these things. But they also make mistakes and they use some of the tactics mentioned above.

I will quote some passages from the current (July 12, 2025) version of their online course, followed by my commentary.

#1: "Given a string of words, you guessed the next one in the sequence. This is basically all that ChatGPT and other LLMs are doing."

This is not true, or at best an oversimplification, and it illustrates one of my main criticisms. They make assertions about LLM's that are meant to apply to all such models, but they don't. For example, o3 can do much more than simply "guessing the next one". It appears to do step-by-step reasoning, and it can back up its claims with cited references. Furthermore, it narrates the steps of its reasoning so you can follow along.

This assertion of Bergstrom and West is also a classic example of the Leibniz mill fallacy, one that was well-satirized here.

#2: "They don’t reason the way that people do. They don’t have any sort of embodied understanding of the world." and then later on "It doesn’t ‘think’ about your question the way a person does."

Perhaps this is true, perhaps not. But they don't know with certainty that, for example, next-word prediction is not part of most human interactions; I would argue that it is! The assertions are also misleading, because there is an implication hiding in the background, something entirely missing from their argument: namely, even if we grant their claim that LLM's don't reason the way people do and don't have an "embodied understanding of the world", it doesn't follow that this is some fundamental limitation on their capabilities. LLM's produce correct results at a rate that can't be explained by chance alone.

#3: "They don’t even have a fundamental sense of truth and falsehood."

This criticism fails because they don't say what it would mean for anything (a person, a machine) to have a "fundamental sense" about anything. Suppose I say in response, actually, some LLM's do indeed have a "fundamental sense of truth and falsehood". How could we test that assertion? What method would they use? I think the assertion can be judged as "not even wrong", to use a phrase of Pauli.

#4: "But don’t let the impressive capabilities of LLMs lure you into thinking that they understand human experience or are capable of logical reasoning."

Two claims, one that is simply wrong ("not capable of logical reasoning") for some LLM's and one that is not even wrong, namely that they can't "understand human experience". The reason why it is "not even wrong" is that they offer no definition of "understand" for which we could carry out a test. "Understand" is such a vague word that one can make such assertions and not back them up.

#5: "It's not intelligent. It doesn't understand anything. It doesn't think."

Yet more assertions made without any supporting evidence. And assertions using words with vague, complicated, and multifaceted meanings like "intelligent", "understand", "think".

I would argue that, in any reasonable understanding of these words, some LLM's are intelligent. It is reasonable to say that many of them do understand things, and that many of them do indeed think. The fact that Bergstrom and West themselves use these words and then characterize this use as "fall[ing] in[to] a trap" illustrates that the colloquial use of these words to describe what LLM's do is, in fact, quite reasonable.

But if we wish to argue about these things, I think there is a need to provide some definitive tests by which we could decide:

* is this particular LLM intelligent? How intelligent?
* does this particular LLM understand anything?
* does this particular LLM think?

Another problem with these assertions is that they seem to suggest that, for example, something is either intelligent or it is not. This is the kind of black-and-white thinking that pervades so much of the discussion about thinking machines. I have criticized this in detail in a number of my blog posts here. Why could it not be that LLM's display an intelligence that perhaps does not yet match humans in some areas, but outperforms them in others? Why must intelligence be reduced to one single capability? Why do we not measure it on a continuum? Intelligence is multifaceted and many people have written about the multiple kinds of intelligence.

#6: "Teddy: Should I trust them?

Carl: No.

I think you can use them to suggest things to try, suggest questions to ask, suggest things to research, but you can't trust them to give you correct answers."

This is interesting to me because it illustrates yet another way there is an implication behind the words. Here the implied syllogism seems to be "because LLM's can't be trusted to give correct answers all the time, they can't be trusted or useful at all". And yet exactly the same thing is true of people, encyclopedias, dictionaries, newspaper articles, books, scientific journals, and all sorts of other ways we gain an understanding of the world! What academic does not have a colleague who makes confident pronouncements on a broad range of issues inside and outside their own competence? Such a colleague is very often right, but sometimes wildly wrong. Should we then condemn all of academia in the same way? We always have the same problem: we need to check the answers, whether provided by an LLM or a person.

Whenever you come up with a criticism of LLM's that is also applicable in exactly the same way to people, then you're not really criticizing LLM's at all. You're just faced with a fundamental problem of epistemology.

#7: "These systems have no ground truth, no underlying model of the world, and no rules of logic."

Well, once again, a combination of what I would consider false statements and statements that are "not even wrong". No underlying model of the world? Of course they do. They have a model of the world based on reading literally millions of texts. If we were communicating with aliens light-years away, their understanding, their model of our world would be similarly based. No rules of logic? That's another example of a claim made but not supported. How do you know that a prediction model can't result in reasoning that is logic-based? I see no mathematical proof of this.

Some LLM's are indeed trained on corpuses of "ground truth", that is, sets of assertions about the world that have been created and checked by experts, such as textbooks, academic papers, case law, etc. I'd include Wikipedia, except that a lot of academics have a low opinion of Wikipedia, too (which in my opinion is unjustified). Ground truth also plays a role in the fine-tuning of models.

But more importantly, I think the emphasis on "ground truth" as an important foundation of knowledge is misplaced. To name just one example, people functioned for hundreds of thousands of years believing that the world was flat. This was their "ground truth", no pun intended. The fact that it is not correct didn't mean that they were not intelligent, that they didn't have minds, or that their reasoning was fundamentally flawed; they were still able to live their lives and exhibit intelligent behavior such as navigating successfully in unfamiliar terrain, and travelling long distances by sea.

The same thing applies, on a shorter timescale, to things like the belief that light travels instantaneously. So I think the criticism that there is no "ground truth" for LLM's does not, once again, amount to a really substantive criticism. To turn it into one, you would have to give a convincing reason why a machine that thinks has to have some particular ground truth in order to be considered intelligent. And if it does need this, how much ground truth is needed? Quantify it.

#8: "Not only do LLMs sometimes fabricate incorrect answers, they also obscure the information sourcing—the blue links—that are part and parcel of traditional search."

Another example of a blanket assertion about LLM's that fails, for example, on models like o3.

#9: "But when an LLM is the author, there is no mind there for a reader to glimpse."

Another example of "not even wrong", which rests on the vagueness of "mind". Given a computer system, what method would we use to decide whether it has a "mind" or not? Is having a mind really a black-and-white property? Do only people have minds? How about animals? Where do you draw the line?

I would argue that not only is it reasonable to say that there is a "mind" in some LLM's; that mind is actually a representation of pieces of thousands or millions of minds. I think that is a much more useful way to think about them.

There is now a 60-year history or more of people making assertions that machines "don't have minds", "don't really think", and "aren't really intelligent". With each new advance in AI, instead of saying, "oh, maybe we were wrong, maybe they really do think after all", the consistent response has been, "sigh, I guess you don't need real intelligence to be able to do X after all". Witness Doug Hofstadter and his remarks about chess after Deep Blue. There is also a 60-year history of people making firm predictions that "machines will never be able to" do a variety of things, including play chess, play Go, play ping-pong, transcribe human speech, translate human language, write a good song, paint a good painting, and so forth. The fact that all of these predictions have now proved false should give everyone pause before making similar assertions.

Monday, January 02, 2023

Yet Another Crappy Analysis of AI

Whenever some new breakthrough in AI (artificial intelligence) is made, you can depend on ignorant naysayers to break out the usual clichés.

We will be told that this program or that one isn't "really" creative (but no testable definition of "creative" will be given). We will be told that some new system doesn't "really" understand what you are saying (but no testable definition of "understand" will be forthcoming). We will be told that computers cannot have "minds" (but no test for having or not having a "mind" will be provided). We will be told that computers aren't "really" reading and understanding what they translate (but no definition of "really reading and understanding" will be given).

So it doesn't surprise me at all that there has been an enormous amount of ignorant nonsense written about ChatGPT and related models. This dreck from Gary N. Smith, a professor of economics at Pomona College, is pretty typical of the genre.

Smith claims that "while GPT-3 can string words together in convincing ways, it has no idea what the words mean". Yet he does not provide any testable definition of "knowing what words mean". Does it really have no idea? I conducted a little experiment of my own, by asking ChatGPT, "What is a cow?" And here is the response:

A cow is a large, hoofed animal that is often kept on a farm for its milk, meat, and other products. 
Cows are part of the bovine family and are known for their docile nature and distinctive mooing 
sound. They are typically reddish-brown or black in color and have long horns on their head. 
Cows are used for a variety of purposes, including the production of milk, cheese, butter, and 
other dairy products. They are also raised for meat, leather, and other products. Cows are 
important sources of food and other resources for people around the world.

Does that really look to you like ChatGPT doesn't know what the word "cow" means?

So ChatGPT does indeed seem to define words. But can it reason correctly about them? Next, I asked, "Can a cow fly over the moon?" and ChatGPT responded as follows:

No, cows are not able to fly over the moon or any other celestial body. Cows are terrestrial 
animals and do not have the ability to fly through the air. They are able to walk and run on 
land, but they do not have wings or other body parts that allow them to fly. In fact, most 
animals are not able to fly, and those that are able to fly typically have specific adaptations, 
such as wings, that allow them to do so.

I was not satisfied with this, so I continued to ask, "Then how did the cow fly over the moon in 'Hey Diddle Diddle' if it is impossible?" And here is how ChatGPT responded:

"Hey Diddle Diddle" is a popular nursery rhyme that has been passed down through the 
generations. It is a simple, nonsensical poem that is meant to be fun and entertaining, 
rather than literally true. The rhyme tells the story of a cow jumping over the moon, 
but it is not intended to be taken literally. Cows do not have the ability to fly, and 
the rhyme is not meant to be taken as a factual account of events. It is simply a 
lighthearted, playful rhyme that has been enjoyed by children and adults for many years.

Seems pretty accurate to me.

Smith justifies his claim by giving an example of a conversation about how many pennies there are in each group if you start with 47 pennies, throw away 44, and divide the remainder into equal-size groups. In his example, ChatGPT gave the wrong answer, with incorrect reasoning to get there. But that conversation isn't any evidence at all that ChatGPT didn't know the meaning of the words "pennies" or "group" or "47". If anything, it shows it was unable to do logical problem-solving in that particular domain, but that's a completely separate issue from knowing the meaning of words.
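
For the record, here is the arithmetic the puzzle calls for, written as a tiny Python check. This is my own illustration, not Smith's code, and it assumes (as in the transcript quoted below) that the remainder is split into three groups:

# Smith's penny puzzle: start with 47 pennies, throw away 44,
# then split what is left into three equal groups.
remaining = 47 - 44          # 3 pennies left
per_group = remaining // 3   # 1 penny in each group
print(remaining, per_group)  # prints: 3 1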

There's nothing easier than taking a chatbot and getting it to produce something nonsensical. After that, you can just claim it doesn't "really" understand English. But human speakers of English produce a flood of nonsense every day, as anyone reading Salon can verify. Does that mean that the writers don't "really" understand English?

I tried Smith's example, and here is what I got:

If you start with 47 pennies and throw away 44 of them, you are left with 47-44=3 pennies.
If you divide these 3 pennies into three groups, each group will have 3/3=1 penny.
So each group will have 1 penny.

Looks pretty good to me! So can we conclude that ChatGPT really does understand the meaning of words now?

I wish the people commenting on AI would demonstrate a little more seriousness about what it means to "really" understand utterances. Or any realization that "understanding" is not a binary; one can have "understanding" to a greater or lesser degree. Does a five-year-old understand the nuances of a word like "duty" the way a retired member of the military does?

I think the right way to view ChatGPT is that it does appear to exhibit some understanding of what words mean, in the sense of apparently being able to define them, use them correctly in context in many conversations, and reason correctly about them. It doesn't mean that everything ChatGPT says is correct, nor that it holds no incorrect beliefs.

I am sure we will see many more crappy pieces like Smith's in the near future.

Friday, September 10, 2021

The Robert Marks Evasion: 7-year anniversary

Well, it seems so soon, but the 7-year anniversary of Robert Marks's complete failure to provide any evidence for his claims about information is upon us.

You may remember it was back on September 9, 2014 that I first asked the illustrious Marks for some calculation justifying a claim of his. Here is what I wrote:

  Dear Prof. Marks:

Here

http://humanevents.com/2014/08/19/biological-information-new-perspectives-from-intelligent-design/

you claimed

"we all agree that a picture of Mount Rushmore with the busts of four US Presidents contains more information than a picture of Mount Fuji".

I asked you there for the details of the calculation that would show this, but you did not reply on that page, so I'm asking again.

Could you please provide me with your calculation to justify this claim?

Regards,

Jeffrey Shallit

  
I asked again after 3 months, 6 months, 1 year, 2 years, 3 years, 4 years, 5 years, 6 years, and now 7 years.

Still no response.

That is the nature of intelligent design creationism. Lots of wild claims, lots of bluster, but no actual evidence.

Science deserves better than this kind of nonsense. A lot better.

Saturday, February 27, 2021

The Fake "Science" of Intelligent Design

The fake "science" of intelligent design claims to provide a reliable methodology for determining if something was "natural" or created through the intervention of an intelligent agent.

Yet somehow its proponents never actually apply it to cases of genuine interest, like this one.

Why is that?

Thursday, September 10, 2020

Robert Marks: The Six-Year Anniversary

Six years ago, the illustrious Baylor professor Robert Marks II made the following claim: "we all agree that a picture of Mount Rushmore with the busts of four US Presidents contains more information than a picture of Mount Fuji".

I didn't agree, so I asked the illustrious Marks for a calculation or other rationale supporting this claim.

After three months, no reply. So I asked again.

After six months, no reply. So I asked again.

After one year, no reply. So I asked again.

After two years, no reply. So I asked again.

After three years, no reply. So I asked again.

After four years, no reply. So I asked again.

After five years, no reply. So I asked again.

Now it's been SIX years. I asked again. Still no reply from the illustrious Marks.

This is typical behavior for advocates of intelligent design. They do not feel any scholarly obligation to produce evidence for their claims. That's one way you know that intelligent design is pseudoscience.

Friday, March 13, 2020

Yet More Unsubstantiated Claims by Egnor


I realize that most people have better things to do currently than read the ravings of a creationist neurosurgeon, but Michael Egnor is at it again, making completely unsubstantiated claims about the minds of people and animals.

Here he claims that "abstract thought (as classical philosophers pointed out) is inherently an immaterial ability and thus it cannot arise from the brain or from any material organ". Actually, there's no evidence at all for this claim. As far as we know, abstract thought is no different from any kind of brain activity, carried out by our neurons and synapses. And if it does not "arise from the brain", what could it possibly arise from?

Abstract reasoning is actually not significantly different from any other kind of reasoning, a point of view espoused for the specific case of mathematical reasoning by George Lakoff and Rafael Nunez in their book Where Mathematics Comes From: How the Embodied Mind Brings Mathematics into Being.

Egnor claims that "Mental activity always has meaning—every thought is about something. Computation always lacks meaning in itself." This is a classic blunder, made by people who have little understanding of the nature of computation. Of course computations have meaning. When we sum the infinite series 1+1/4+1/9+... using a program such as Maple, by typing sum(1/n^2,n=1..infinity); who can reasonably deny that the answer π²/6 it produces has meaning? This classic error was debunked as long ago as 1843, when Ada Lovelace wrote, "Many persons who are not conversant with mathematical studies, imagine that because the business of the engine is to give its results in numerical notation, the nature of its processes must consequently be arithmetical and numerical, rather than algebraical and analytical. This is an error. The engine can arrange and combine its numerical quantities exactly as if they were letters or any other general symbols; and in fact it might bring out its results in algebraical notation, were provisions made accordingly." This is an abstract example, but if you want examples related to the real world, just consider the data collected and processed to produce weather predictions. If these computations had no meaning, how is it that short-term weather forecasts are so accurate?
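
For readers without Maple, here is the same computation sketched in Python; this is my own illustration, not anything Egnor discusses. The exact sum assumes the sympy library is installed; the numerical check needs only the standard library.

# Exact summation, the analogue of the Maple command above (requires sympy).
import sympy
n = sympy.symbols("n")
print(sympy.summation(1/n**2, (n, 1, sympy.oo)))    # prints pi**2/6

# A purely numerical check: the partial sums creep up toward pi^2/6.
import math
print(sum(1.0 / k**2 for k in range(1, 100001)))    # about 1.644924
print(math.pi**2 / 6)                               # about 1.644934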

Egnor goes on to justify his bogus claim by saying, "A word processing program doesn't care about the opinion that you’re expressing when you use it." But what does this have to do with anything? A secretary who types up letters also probably doesn't care about the content of the letters the boss dictates; does this mean he or she has no mind? How did we get from "meaning" to "caring"? It's a huge non sequitur that Egnor doesn't bother to explain.

In another screed, Egnor repeats for the n'th time his bogus claims about the minds of animals. He writes, "No animal (except man) can do statistics, because statistical reasoning is abstract and only human beings are capable of abstract thought." But, as usual, he ignores the evidence against his claim, and provides not a shred of evidence in favor of it. All he does is assert. (Three links: one, two, three. I can produce many more.)

He closes with this, which is one of the least self-aware claims I've ever seen: "Only human beings can reason abstractly because only human beings have rational souls. Rational souls have an immaterial aspect—a spiritual aspect, because we are created in the Image of our Creator, who is a Spirit. That's a scientific inference."

No, that's just religious babble.