Wednesday, March 23, 2011

Nobody - Not Even Creationists - Seems to Know How To Calculate Dembski's "CSI"

Back in 2001, when I was on sabbatical in Tucson, Arizona, I decided to spend some time trying to understand Dembski's "complex specified information" (CSI) to see if there was anything to it. The result was my long paper with Elsberry, where we concluded that CSI was a hopeless, incoherent mess that didn't have the properties Dembski claimed. A shorter version of the paper has recently appeared in Synthese.

Now, over on Uncommon Descent, there is an amusing thread which demonstrates our conclusion. Nobody, not even the creationists, can seemingly agree on the most simple assertions about CSI. That's because it's a hopeless, incoherent mess.

18 comments:

Larry Tanner said...

And, as I ask: Where's the 'Ski?

Anonymous said...

I am tempted to leave a comment over at UD congratulating them on actually discussing ID, as distinct from religion, theology, the soul, religion ....

They are obviously getting nowhere, and I wonder if they can see that, but at least it's a break from religion, religion ...

Jeffo said...

I applaud Mathgrrl for trying to uphold a standard in the thread, and for not getting baited. It's remarkable how many participants don't seem to understand even what constitutes a coherent rational discussion of a particular topic. She asks the participants to explain the CSI calculation for the specific examples (focusing on the first one, I guess because it is the most accessible), and gives a credible rationale for the question from Dembski's own text. The only mildly relevant responses fail to meet some of Dembski's own basic criteria (such as that an object's history and origin are not to be a factor). The amount of misdirection and bad faith in the responses is just astounding.

Les Lane said...

The difficulty is that biological information is not specified. What is specified is a process or structure. (Vastly) Many different types of information can normally encode a single process.

Dave S said...

"Nobody, not even the creationists, can seemingly agree on the most simple assertions about CSI. That's because ... "

Surely having someone as contentious as Joe "Ya see" G as part of the discussion doesn't help the matter.

Jeffo said...

And the 173rd comment is finally an explicit calculation! It will be interesting to see Mathgrrl's response.

KeithB said...

Except #173 is just parroting Dembski's calculation of the CSI of the flagellum. It has nothing to do with MathGrrl's question, or whether any calculation of CSI has to be run past Dembski for approval.

Der Hammerman said...

Have you ever noticed how Dembski doesn't even look like he believes what he's saying?

Argon said...

Oh His Merciful Noodlelyness!
Let's see: Duplication isn't SI increasing because it doesn't take into account the origin of the original sequence... Duplication doesn't add SI because it only increases the quantity of a protein and CSI is about 'quality'. You can't say anything about the change in SI because you need to know all the possible, allowed configurations...

But regardless, even though we won't put our necks out and suggest a calculation method, we're all real sure that CSI wouldn't be higher in a cell if a duplication resulted in increased protein expression.

Ha! I like Salvador's little blurb about how great he is.

Chris P said...

As I asked Dougy Groothuis, it's all very well having an intelligent designer but that isn't the whole process. Things have to be made.

Who is the manufacturer? Or is that magic too?

Doug wasn't interested in this at all, which was most surprising. You'd have thought they would at least have had some thoughts about the whole mechanism.

Tatarize said...

Because everybody understands microevolution is like walking a few feet going "right foot, left foot" and you can look at the macroevolution version like walking a thousand miles going "right foot, left foot" and know that it's absolutely impossible to do!

So somewhere between 2 and 5 million feet there's a boundary where "right foot/left foot" goes from being hard to being magical. It's just not an easy point to properly calculate.

Larry said...

Why do these people refuse to admit their mistakes and take on board what people who actually understand these fields tell them?

I just flicked through the series of Stephen Meyer's talks linked to on Homologous Legs, and cross-linked on Panda's Thumb, and he just repeats the same bogus, repeatedly refuted claims over and over again. At one point during the third episode he repeats the discredited claim that "Information always comes from intelligence." This exact phrase is then presented on the screen and the host, John Ankerberg, emphasizes the importance of the word "always." Yet, we know this claim to be false.

andrew said...

I once designed a series of experiments that the ID-folks ought to be doing to prove their so-called point--that is, assuming they had one and weren't lying for Jesus. My favorite of said experiments concerned language. Since Luskin claimed that sentences have CSI whereas random strings of letters do not, then they ought to be able to tell solely on the basis of CSI which letter strings are sentences and which are not. In other words, they ought to be able to tell which of the following letter strings is a real sentence:

агдлнвебгещщртнльуеибжрмжшкрахряжкйксхуепдбйбъслауь

кейсилъскинелъжецитъпанаркойтонеразбираотинформация

and when you put it that way, the basic absurdity of it becomes apparent, as well as the dishonesty of their conflating understanding of the Latin alphabet and the English language with a vindication of their bogus 'theory'.

Dave S said...

Andrew, if I had to pick, I'd say the second one: it has a letter frequency distribution that matches the typical pattern of vowels and consonants.
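[The frequency heuristic Dave S describes is easy to make concrete. A minimal sketch in Python, using hypothetical Latin-alphabet stand-ins rather than the Cyrillic strings above: a real sentence has a vowel fraction near that of typical English text, while a consonant-only gibberish string does not.]

```python
# Sketch of the letter-frequency heuristic: natural-language text has a
# characteristic fraction of vowels, while arbitrary letter strings often
# do not. The example strings are illustrative stand-ins, not the
# Cyrillic strings from the comment above.

VOWELS = set("aeiou")

def vowel_fraction(s: str) -> float:
    """Fraction of alphabetic characters in s that are vowels."""
    letters = [c for c in s.lower() if c.isalpha()]
    return sum(c in VOWELS for c in letters) / len(letters)

sentence = "the quick brown fox jumps over the lazy dog"
gibberish = "zqxkvjwptrlmnbgdfhszqxkvjwptrlmnbgdfhs"

print(vowel_fraction(sentence))   # ≈ 0.31, close to typical English text
print(vowel_fraction(gibberish))  # 0.0: no vowels at all
```

[Of course, as andrew notes, this only detects statistical regularities of a known language; it says nothing about CSI.]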

andrew said...

quite right. but you cheated. :P

i actually posted this over at uncommon descent, but i phrased it rather more politely and used numbers instead of letters to get around the consonant-vowel ratio problem. ideally one would use something exotic like Hebrew for maximum effect.

i also specifically said 'no cheating'. for all the good that will do lol.

so far though, several hours have gone by and we're still 'awaiting moderation'.

Blake Stacey said...

This post reminded me about your paper with Elsberry, and while rereading that paper, I dereferenced pointers back to your article with Ming-Wei Wang on "automatic complexity of strings". I have a question about your complexity measure A(x).

Shannon entropy satisfies a strong subadditivity property:

H(X ∪ Y) <= H(X) + H(Y) - H(X ∩ Y) (1)

for sets of random variables X and Y. The quantum information theory people make a big to-do about the fact that von Neumann entropy satisfies an analogous rule; up to the logarithmic terms which show up all over algorithmic information theory, an analogous statement holds for Kolmogorov complexity as well, I believe (per Theorem 3.5 in Grünwald and Vitányi, arXiv:cs/0410002). However, Theorem 13 in Shallit and Wang seems to say that strong subadditivity will, in general, fail for the automatic complexity A(x), as the left-hand side of (1) could be O(√n) while the right-hand side is only O(1).
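[On the Shannon side, the three-variable form of (1), H(X,Y,Z) + H(Y) <= H(X,Y) + H(Y,Z), with "union" = {X,Y,Z} and "intersection" = {Y}, can be checked numerically. A small pure-stdlib sketch; the joint distribution below is an arbitrary illustrative example, not taken from any of the papers cited.]

```python
# Numerical check of strong subadditivity for Shannon entropy:
# H(X,Y,Z) + H(Y) <= H(X,Y) + H(Y,Z).
from math import log2

# Arbitrary joint distribution p(x, y, z) over three binary variables.
p = {
    (0, 0, 0): 0.10, (0, 0, 1): 0.05, (0, 1, 0): 0.20, (0, 1, 1): 0.15,
    (1, 0, 0): 0.05, (1, 0, 1): 0.10, (1, 1, 0): 0.15, (1, 1, 1): 0.20,
}

def H(indices):
    """Shannon entropy (bits) of the marginal over the given coordinates."""
    marginal = {}
    for outcome, prob in p.items():
        key = tuple(outcome[i] for i in indices)
        marginal[key] = marginal.get(key, 0.0) + prob
    return -sum(q * log2(q) for q in marginal.values() if q > 0)

lhs = H((0, 1, 2)) + H((1,))      # H(X,Y,Z) + H(Y)
rhs = H((0, 1)) + H((1, 2))       # H(X,Y) + H(Y,Z)
print(lhs <= rhs + 1e-12)         # True
```

[For Shannon entropy this inequality is a theorem, so it holds for any choice of joint distribution; the interesting question is precisely which analogues survive for A(x).]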

Being a physicist by training, I'm really remarkably bad at thinking about deterministic finite automata, so I would most definitely appreciate any comments you might have on this point.

[Attempted to ask over e-mail; got "greylist" bounce from graceland.math.uwaterloo.ca; figured I'd ask here]

Jeffrey Shallit said...

It depends on your interpretation of "intersection". But yes, some version of subadditivity will fail for our measure.

Blake Stacey said...

What I was thinking was something like this:

Say we have a collection of random variables, call them X_1 through X_N, which are described by the joint probability distribution p(X_1,...,X_N). Maybe the distribution factors nicely, meaning that the random variables are independent, or maybe it doesn't; there could be nontrivial mutual informations between variables, or in principle even correlations of higher order. The Shannon entropy of some subset {X_a} is defined from the marginal probability distribution found by integrating out the other random variables, and by the nature of Shannon entropy, various constraints will hold among the amounts of information contained in the different subsets of random variables.

Now, if we want an algorithmic analogue of this so that we can consider it in a single-shot way, as with Vitányi's "algorithmic statistics", it seems the natural thing to do is take a set of N bitstrings, call them S_1 through S_N. Then the Kolmogorov information of the various subsets {S_a} satisfies relations analogous to those found in the Shannon case, provided we have some function 〈•,•〉 which codes two strings onto the same tape in a recoverable way, so that we can define

K(S, T) = K(〈S, T〉),

and so on for the Kolmogorov information of larger sets of strings. So, "union" and "intersection" are just the set-theoretic union and intersection of sets of the fundamental objects we're considering, whether those objects are random variables or constant bitstrings.