Sunday, July 13, 2025

Yet Another Bad Analysis of AI

In the past I've commented on bad discussions of thinking, intelligence, brains, and computers, such as those by Gary N. Smith, Doug Hofstadter, Arthur Gardner, and Robert Epstein.

With the rise of LLM's like ChatGPT, the commentary has only gotten worse. We see the same kinds of mistakes and bad argumentation tactics that have been used for decades. Here are some of them.

  1. Commenters will always say things like "It looks like an LLM is thinking, but it's not really thinking." But they'll never explain what the difference is between "thinking" and "really thinking".
  2. Commenters will always say things like "It's not really thinking, it's just X", where X is some mechanical or physical process. This is a mistake that goes back to Leibniz, and just amounts to a categorical denial that atomic computational processes can give rise to minds.
  3. Whenever a new advance in AI comes along and falsifies a prediction along the lines of "computers, lacking true intelligence, will never do X", commenters won't admit that they do have "true intelligence". They'll just say, "Oh, I guess I was wrong, doing X doesn't require true intelligence."
  4. Commenters will use vague words like "understand" and "think", but never provide any definitive tests by which we could determine whether a computational system does these actions.
  5. Commenters will always compare AI to only the most competent humans, never to the average person. For example, when ChatGPT matches or slightly exceeds average human performance on some benchmark, this will not be regarded as evidence of thought. "Their results are not always trustworthy!" people will say of LLM's, ignoring the fact that neither are people, encyclopedias, or any other source of information we depend on.
  6. Commenters will claim that some new AI implementation lacks some quality of humans. Maybe it's that the AI is not "embodied", or "lacks feelings", or "doesn't have beliefs" or "lacks ground truth". But they won't provide any clean, direct argument that these things are needed to think or be intelligent. It's the old "airplanes aren't really flying because they don't flap their wings like birds" fallacy. This is where a background in the theory of computation helps. One of the first things you learn is that one computational model can simulate another; see the small sketch right after this list. Having gears or transistors or neurons is not a fundamental aspect of computation.
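
To make that last point concrete, here is a minimal sketch, entirely my own toy example and not drawn from any of the commentary discussed below: a few lines of Python (a program in one computational model) simulating a Turing machine (a program in a very different one). The function run_tm, the rule table, and the little increment machine are all invented for illustration.

# One computational model simulating another: Python simulating a Turing machine.
# The machine below increments a binary number written on the tape.

def run_tm(tape, rules, state="start", blank="_", max_steps=10_000):
    """Simulate a single-tape Turing machine given as a rule table."""
    cells = dict(enumerate(tape))      # sparse tape: position -> symbol
    pos = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(pos, blank)
        new_symbol, move, state = rules[(state, symbol)]
        cells[pos] = new_symbol
        pos += 1 if move == "R" else -1
    return "".join(cells[i] for i in sorted(cells)).strip(blank)

# Rule table: (state, symbol) -> (symbol to write, move, next state).
rules = {
    ("start", "0"): ("0", "R", "start"),   # scan right to the end of the input
    ("start", "1"): ("1", "R", "start"),
    ("start", "_"): ("_", "L", "carry"),   # past the right end: begin carrying
    ("carry", "1"): ("0", "L", "carry"),   # 1 plus carry is 0, keep carrying left
    ("carry", "0"): ("1", "L", "halt"),    # 0 plus carry is 1, done
    ("carry", "_"): ("1", "L", "halt"),    # past the left end: new leading 1
}

print(run_tm("1011", rules))   # prints 1100: binary 11 + 1 = 12

Nothing in this simulation depends on what the underlying hardware is made of; the same point holds when neurons simulate arithmetic or when transistors simulate neural networks.
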
Here is yet another example. There is a new online course by Carl T. Bergstrom and Jevin D. West, professors at the University of Washington. These are distinguished scholars, and, as you might expect, their analysis is not as bad as some of the others I've cited above. But it is noteworthy that neither of these two professors has a background in neuroscience, the theory of computation, or machine learning.

This is not to say they are entirely wrong. Some of the things they say make sense. They talk about the limitations of some current LLM's, and they warn about the dangers of relying on current LLM's without checking their results independently. They caution that putting LLM's in charge of critical systems is a danger. They argue that use of LLM's by students may result in decreased educational gain. I agree with all these things. But they also make mistakes and they use some of the tactics mentioned above.

I will quote some passages from the current (July 12 2025) version of their online course, followed by my commentary.

#1: "Given a string of words, you guessed the next one in the sequence. This is basically all that ChatGPT and other LLMs are doing."

This is not true, or at best, an oversimplification, and it illustrates one of my main criticisms. They make assertions about LLM's that are meant to apply to all such models, but they don't. For example, o3 can do much more than simply "guessing the next one". It appears to do step-by-step reasoning, and it can back up its claims with cited references. Furthermore, it narrates the steps of its reasoning so you can follow along.

This assertion of Bergstrom and West is also a classic example of the Leibniz mill fallacy, one that was well-satirized here.
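
To see what the bare "guess the next word" picture actually describes, here is a toy sketch of my own; the tiny corpus, the function predict_next, and everything else in it are invented for illustration and have nothing to do with Bergstrom and West's course. A real LLM learns enormously richer statistics than this, and systems like o3 add retrieval, tool use, and multi-step reasoning on top of the base model, which is part of why the quoted claim is an oversimplification.

# A toy next-word predictor built from bigram counts.
from collections import Counter, defaultdict

corpus = "the cow jumped over the moon and the dish ran away with the spoon".split()

# Count which word follows which in the corpus.
following = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    following[word][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word`, or None if never seen."""
    counts = following[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))   # prints "cow" (ties broken by earliest occurrence)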

#2: "They don’t reason the way that people do. They don’t have any sort of embodied understanding of the world." and then later on "It doesn’t ‘think’ about your question the way a person does."

Perhaps this is true, perhaps not. But they don't know with certainty, for example, that next-word prediction is not part of most human interactions; I would argue that it is! But the assertions are also misleading, because there is an implication hiding in the background, something entirely missing from their argument: namely, even if we grant their claim that LLM's don't reason like people do and don't have an "embodied understanding of the world", it doesn't follow that this is some really fundamental limitation on their capabilities. LLM's produce correct results at a rate that can't be explained by chance alone.

#3: "They don’t even have a fundamental sense of truth and falsehood."

This criticism fails because they don't say what it would mean for anything (a person, a machine) to have a "fundamental sense" about anything. Suppose I say in response, actually, some LLM's do indeed have a "fundamental sense of truth and falsehood". How could we test that assertion? What method would they use? I think the assertion can be judged as "not even wrong", to use a phrase of Pauli.

#4: "But don’t let the impressive capabilities of LLMs lure you into thinking that they understand human experience or are capable of logical reasoning."

Two claims, one that is simply wrong ("not capable of logical reasoning") for some LLM's and one that is not even wrong, namely that they can't "understand human experience". The reason why it is "not even wrong" is that they offer no definition of "understand" for which we could carry out a test. "Understand" is such a vague word that one can make such assertions and not back them up.

#5: "It's not intelligent. It doesn't understand anything. It doesn't think."

Yet more assertions made without any supporting evidence. And assertions using words with vague, complicated, and multifaceted meanings like "intelligent", "understand", "think".

I would argue that, in any reasonable understanding of these words, some LLM's are intelligent. It is reasonable to say that many of them do understand things, and that many of them do indeed think. The fact that Bergstrom and West themselves use these words and then characterize this use as "fall[ing] in[to] a trap" illustrates that the colloquial use of these words to describe what LLM's do is, in fact, quite reasonable.

But if we wish to argue about these things, I think there is a need to provide some definitive tests by which we could decide,

* is this particular LLM intelligent? How intelligent?
* does this particular LLM understand anything?
* does this particular LLM think?

Another problem with these assertions is that they seem to suggest that, for example, something is either intelligent or it is not. This is the kind of black-and-white thinking that pervades so much of the discussion about thinking machines. I have criticized this in detail in a number of my blog posts here. Why could it not be that LLM's display an intelligence that perhaps does not yet match humans in some areas, but outperforms them in others? Why must intelligence be reduced to one single capability? Why do we not measure it on a continuum? Intelligence is multifaceted and many people have written about the multiple kinds of intelligence.

#6: "Teddy: Should I trust them?

Carl: No.

I think you can use them to suggest things to try, suggest questions to ask, suggest things to research, but you can't trust them to give you correct answers."

This is interesting for me because it illustrates yet another way there is an implication behind the words. Here the implied syllogism seems to be "because LLM's can't be trusted to give correct answers all the time, they can't be trusted or useful at all". And yet exactly the same thing is true of people, encyclopedias, dictionaries, newspaper articles, books, scientific journals, and all sorts of other ways we gain an understanding of the world! What academic does not have a colleague that makes pronouncements with confidence on a broad range of issues in and outside their own competence? Such a colleague is very often right, but sometimes wildly wrong. Should we then condemn all academia in the same way? We always have the same problem: we need to check the answers, whether provided by an LLM or a person.

Whenever you come up with a criticism of LLM's that is also applicable in exactly the same way to people, then you're not really criticizing LLM's at all. You're just faced with a fundamental problem of epistemology.

#7: "These systems have no ground truth, no underlying model of the world, and no rules of logic."

Well, once again, a combination of what I would consider false statements and statements that are "not even wrong". No underlying model of the world? Of course they do. They have a model of the world based on reading literally millions of texts. If we were communicating with aliens light-years away, their understanding, their model of our world would be similarly based. No rules of logic? That's another example of a claim made but not supported. How do you know that a prediction model can't result in reasoning that is logic-based? I see no mathematical proof of this.

Some LLM's are indeed trained on corpuses of "ground truth", that is, a set of assertions about the world that has been created and checked by experts, such as textbooks, academic papers, case law, etc. I'd include Wikipedia, except that a lot of academics have a low opinion about Wikipedia, too (which in my opinion is unjustified). Ground truth also finds a role in fine-tuning of models.

But more importantly, I think the emphasis on "ground truth" as an important foundation of knowledge is misplaced. To name just one example, people functioned for hundreds of thousands of years believing that the world was flat. This was their "ground truth", no pun intended. The fact that it is not correct didn't mean that they were not intelligent, that they didn't have minds, or that their reasoning was fundamentally flawed; they were still able to live their lives and exhibit intelligent behavior such as navigating successfully in unfamiliar terrain, and travelling long distances by sea.

The same thing applies, on a shorter time scale, to things like the belief that light travels instantaneously. So I think the criticism that there is no "ground truth" for LLM's does not, once again, amount to a really substantive criticism. To turn it into one, you would have to have a convincing reason why a machine that thinks has to have some particular ground truth in order to be considered intelligent. And if it does need this, how much ground truth is needed? Quantify it.

#8: "Not only do LLMs sometimes fabricate incorrect answers, they also obscure the information sourcing—the blue links—that are part and parcel of traditional search."

Another example of a blanket assertion about LLM's that fails, for example, on models like o3.

#9: "But when an LLM is the author, there is no mind there for a reader to glimpse."

Another example of "not even wrong", which rests on the vagueness of "mind". Given a computer system, what method would we use to decide whether it has a "mind" or not? Is it really a black-and-white quantity? Do only people have minds? How about animals? Where is the line you draw?

I would argue that not only is it reasonable to say that there is a "mind" in some LLM's, but that it is actually a representation of pieces of thousands or millions of minds. I think that is a much more useful way to think about them.

There is now a 60-year history or more of people making assertions that machines "don't have minds", "don't really think" and "aren't really intelligent". With each new advance in AI, instead of saying, "oh, maybe we were wrong, maybe they really do think after all", the consistent response has been, "sigh, I guess you don't need real intelligence to be able to do X after all". Witness Doug Hofstadter and his remarks about chess after Deep Blue. There is also a 60-year history of people making firm predictions that "machines will never be able to" do a variety of things, including play chess, play Go, play ping-pong, transcribe human language, translate human language, write a good song, paint a good painting, and so forth. The fact that all of these predictions have now proved false should give everyone pause before making similar assertions.

Monday, January 02, 2023

Yet Another Crappy Analysis of AI

Whenever some new breakthrough in AI (artificial intelligence) is made, you can depend on ignorant naysayers to break out the usual clichés.

We will be told that this program or that one isn't "really" creative (but no testable definition of `creative' will be given). We will be told that some new system doesn't "really" understand what you are saying (but no testable definition of `understand' will be forthcoming). We will be told that computers cannot have "minds" (but no test for having or not having a "mind" will be provided). We will be told that computers aren't "really" reading and understanding what they translate, even though no definition of what "really reading and understanding" means will be given.

So it doesn't surprise me at all that there has been an enormous amount of ignorant nonsense written about ChatGPT and related models. This dreck from Gary N. Smith, a professor of economics at Pomona College, is pretty typical for the genre.

Smith claims that "while GPT-3 can string words together in convincing ways, it has no idea what the words mean". Yet he does not provide any testable definition of "knowing what words mean". Does it really have no idea? I conducted a little experiment of my own, by asking ChatGPT, "What is a cow?" And here is the response:

A cow is a large, hoofed animal that is often kept on a farm for its milk, meat, and other products. 
Cows are part of the bovine family and are known for their docile nature and distinctive mooing 
sound. They are typically reddish-brown or black in color and have long horns on their head. 
Cows are used for a variety of purposes, including the production of milk, cheese, butter, and 
other dairy products. They are also raised for meat, leather, and other products. Cows are 
important sources of food and other resources for people around the world.

Does that really look to you like ChatGPT doesn't know what the word "cow" means?

So ChatGPT does indeed seem to be able to define words. But can it reason correctly about them? I asked, "Can a cow fly over the moon?" and ChatGPT responded as follows:

No, cows are not able to fly over the moon or any other celestial body. Cows are terrestrial 
animals and do not have the ability to fly through the air. They are able to walk and run on 
land, but they do not have wings or other body parts that allow them to fly. In fact, most 
animals are not able to fly, and those that are able to fly typically have specific adaptations, 
such as wings, that allow them to do so.
I was not satisfied with this, so I continued to ask, "Then how did the cow fly over the moon in "Hey Diddle Diddle" if it is impossible?" And here is how ChatGPT responded:
"Hey Diddle Diddle" is a popular nursery rhyme that has been passed down through the 
generations. It is a simple, nonsensical poem that is meant to be fun and entertaining, 
rather than literally true. The rhyme tells the story of a cow jumping over the moon, 
but it is not intended to be taken literally. Cows do not have the ability to fly, and 
the rhyme is not meant to be taken as a factual account of events. It is simply a 
lighthearted, playful rhyme that has been enjoyed by children and adults for many years.

Seems pretty accurate to me.

Smith justifies his claim by giving an example of a conversation about how many pennies there are in each group if you start with 47 pennies, you throw away 44, and divide the remainder into equal-size groups. In his example, ChatGPT gave the wrong answer and incorrect reasoning to get there. But that conversation isn't any evidence at all that ChatGPT didn't know the meaning of the words "pennies" or "group" or "47". If anything, it shows it was unable to do logical problem-solving in that particular domain, but that's a completely separate issue from knowing the meaning of words.

There's nothing easier than taking a chatbot and getting it to produce something nonsensical. After that, you can just claim it doesn't "really" understand English. But human speakers of English produce a flood of nonsense every day, as anyone reading Salon can verify. Does that mean that the writers don't "really" understand English?

I tried Smith's example, and here is what I got:

If you start with 47 pennies and throw away 44 of them, you are left with 47-44=3 pennies.
If you divide these 3 pennies into three groups, each group will have 3/3=1 penny.
So each group will have 1 penny.

Looks pretty good to me! So can we conclude that ChatGPT really does understand the meaning of words now?

I wish the people commenting on AI would demonstrate a little more seriousness about what it means to "really" understand utterances. Or any realization that "understanding" is not a binary; one can have "understanding" to a greater or lesser degree. Does a five-year-old understand the nuances of a word like "duty" the way a retired member of the military does?

I think the right way to view ChatGPT is that it does appear to exhibit some understanding of what words mean, in the sense of apparently being able to define them, use them correctly in context in many conversations, and reason correctly about them. It doesn't mean that everything ChatGPT says is correct, nor that it holds no incorrect beliefs.

I am sure we will see many more crappy pieces like Smith's in the near future.

Friday, September 10, 2021

The Robert Marks Evasion: 7-year anniversary

Well, it seems so soon, but the 7-year anniversary of Robert Marks's complete failure to provide any evidence for his claims about information is upon us.

You may remember it was back on September 9 2014 that I first asked the illustrious Marks for some calculation justifying the following claim of his:

  Dear Prof. Marks:

Here

http://humanevents.com/2014/08/19/biological-information-new-perspectives-from-intelligent-design/

you claimed

"we all agree that a picture of Mount Rushmore with the busts of four US Presidents contains more information than a picture of Mount Fuji".

I asked you there for the details of the calculation that would show this, but you did not reply on that page, so I'm asking again.

Could you please provide me with your calculation to justify this claim?

Regards,

Jeffrey Shallit

  
I asked again after 3 months, 6 months, 1 year, 2 years, 3 years, 4 years, 5 years, 6 years, and now 7 years.

Still no response.

That is the nature of intelligent design creationism. Lots of wild claims, lots of bluster, but no actual evidence.

Science deserves better than this kind of nonsense. A lot better.

Saturday, February 27, 2021

The Fake "Science" of Intelligent Design

The fake "science" of intelligent design claims to provide a reliable methodology for determining if something was "natural" or created through the intervention of an intelligent agent.

Yet somehow its proponents never actually apply it to cases of genuine interest, like this one.

Why is that?

Thursday, September 10, 2020

Robert Marks: The Six-Year Anniversary

Six years ago, the illustrious Baylor professor Robert Marks II made the following claim: "we all agree that a picture of Mount Rushmore with the busts of four US Presidents contains more information than a picture of Mount Fuji".

I didn't agree, so I asked the illustrious Marks for a calculation or other rationale supporting this claim.

After three months, no reply. So I asked again.

After six months, no reply. So I asked again.

After one year, no reply. So I asked again.

After two years, no reply. So I asked again.

After three years, no reply. So I asked again.

After four years, no reply. So I asked again.

After five years, no reply. So I asked again.

Now it's been SIX years. I asked again. Still no reply from the illustrious Marks.

This is typical behavior for advocates of intelligent design. They do not feel any scholarly obligation to produce evidence for their claims. That's one way you know that intelligent design is pseudoscience.

Friday, March 13, 2020

Yet More Unsubstantiated Claims by Egnor


I realize that most people have better things to do currently than read the ravings of a creationist neurosurgeon, but Michael Egnor is at it again, making completely unsubstantiated claims about the minds of people and animals.

Here he claims that "abstract thought (as classical philosophers pointed out) is inherently an immaterial ability and thus it cannot arise from the brain or from any material organ". Actually, there's no evidence at all for this claim. As far as we know, abstract thought is no different from any kind of brain activity, carried out by our neurons and synapses. And if it does not "arise from the brain", what could it possibly arise from?

Abstract reasoning is actually not significantly different from any other kind of reasoning, a point of view espoused for the specific case of mathematical reasoning by George Lakoff and Rafael Núñez in their book Where Mathematics Comes From: How the Embodied Mind Brings Mathematics into Being.

Egnor claims that "Mental activity always has meaning—every thought is about something. Computation always lacks meaning in itself." This is a classic blunder, made by people who have little understanding of the nature of computation. Of course computations have meaning. When we sum the infinite series 1+1/4+1/9+... using a program such as Maple, by typing sum(1/n^2,n=1..infinity); who can reasonably deny that the answer Π2/6 it produces has meaning? This classic error was debunked as long ago as 1843, when Ada Lovelace wrote, "Many persons who are not conversant with mathematical studies, imagine that because the business of the engine is to give its results in numerical notation, the nature of its processes must consequently be arithmetical and numerical, rather than algebraical and analytical. This is an error. The engine can arrange and combine its numerical quantities exactly as if they were letters or any other general symbols; and in fact it might bring out its results in algebraical notation, were provisions made accordingly." This is an abstract example, but if you want examples related to the real world, just consider the data collected and processed to produce weather predictions. If these computations had no meaning, how is it that short-term weather forecasts are so accurate?

Egnor goes on to justify his bogus claim by saying, "A word processing program doesn't care about the opinion that you’re expressing when you use it." But what does this have to do with anything? A secretary that types up letters also probably doesn't care about the content of the letters the boss dictates; does this mean he/she has no mind? How did we get from "meaning" to "caring"? It's a huge non sequitur that Egnor doesn't bother to explain.

In another screed, Egnor repeats for the n'th time his bogus claims about the minds of animals. He writes, "No animal (except man) can do statistics, because statistical reasoning is abstract and only human beings are capable of abstract thought." But, as usual, he ignores the evidence against his claim, and provides not a shred of evidence in favor of it. All he does is assert. (Three links: one, two, three. I can produce many more.)

He closes with this, which is one of the least self-aware claims I've ever seen: "Only human beings can reason abstractly because only human beings have rational souls. Rational souls have an immaterial aspect—a spiritual aspect, because we are created in the Image of our Creator, who is a Spirit. That's a scientific inference."

No, that's just religious babble.

Monday, October 14, 2019

Robert George on Mill and Newman


Every so often, the Catholic Church goes through the bizarre process of elevating one of its adherents to the status of saint. This absurd spectacle demands that the wannabee-saint be responsible for at least two miracles. Now it's the turn of John Henry Newman (1801-1890), a British theologian. For some inexplicable reason, although Newman died almost 130 years ago, it's only quite recently that prayers that invoked his name have had the desired effect.

Jack Sullivan supposedly had back pain, and he claims to have been cured after praying to Newman. Well, it's not like spontaneous remission of back pain ever happens, right? It must have been a miracle!

Melissa Villalobos supposedly had internal bleeding while pregnant. She also prayed to Newman, and claimed to be healed. It must have been a miracle! No one could possibly come up with any other explanation, right?

Recently on twitter, Princeton professor Robert George celebrated this momentous event by recalling his paper on John Stuart Mill and John Henry Newman. I have to admit, I am not usually in the habit of reading papers published in obscure religious journals, but I was intrigued. So I read it.

That was a mistake.

It is pretty bad. Here, very briefly, are just a few of the things wrong with it: it's sloppily proofread; it uses private redefinitions of basic terms; it doesn't so much argue as just make assertions; it's full of bafflegab; it doesn't adequately support its main contention; and it fails to be a scholarly contribution.

Sloppy proofreading: I'll just cite two instances (there are others): "defenses f freedom" in the very first paragraph! Then, later on, "neither to each other not to some common substance" ("not" instead of "nor"). Did anyone -- author or publisher -- make even the most cursory effort here?

Makes assertions instead of argues: "Christian philosophical anthropology ... has proved to be far more plausible and reliable than the alternative that Mill, quite uncritically, accepted". No actual argument or citation provided.

Private redefinitions of basic terms: religion is defined as "the active quest for spiritual truth and the conscientious effort to live with integrity and authenticity in line with one’s best judgments regarding the ultimate sources of meaning and value, and to fulfill one’s obligations (spiritual and moral) in both the public and private dimensions of one's life". A dishonest rhetorical ploy: define "religion" so broadly it encompasses nearly every action by an ethical person.

Bafflegab: top, p. 42: George uses 17 lines to make the trivial observation that happiness and human flourishing are functions of multiple variables with no obvious way to compare or weight them, in order to achieve a maximizing outcome everyone will agree with. Then why not just say that?

More bafflegab: "the dignity of human persons" (p. 44). "Dignity" is the ultimate weasel word; what you regard as essential to human dignity (e.g., forbidding contraception) I could just as easily regard as an example of human indignity.

Very few citations: e.g., George mentions criticism of Mill by Hart (but doesn't bother to give a citation). This is not scholarly behavior.

The main point is not adequately supported: Why exactly do duties automatically confer rights? Adherents of the religion of Christian Identity believe black people are subhuman and one has a duty to subjugate and exterminate them. How does this confer a right to do so?

Let's face it: the Christian account of morality is completely unsupported and incoherent. Some philosophers still have a medieval view of man's nature that is completely unmoored from modern discoveries of evolution and psychology.

Man is not a "rational creature" as George claims, and this absurdly bad essay is proof of that. In my field, junk as bad as this just could not get published in a reputable journal, and if it does somehow manage to, everyone would laugh.