Sunday, October 04, 2009

Jonathan Wells: Another ID Creationist Who Doesn't Understand Information Theory

Intelligent design creationists love to talk about information theory, but unfortunately they rarely understand it. Jonathan Wells is the latest ID creationist to demonstrate this.

In a recent post at "Evolution News & Views" describing an event at the University of Oklahoma, Wells said, "I replied that duplicating a gene doesn’t increase information content any more than photocopying a paper increases its information content."

Wells is wrong. I frequently give this as an exercise in my classes at the University of Waterloo: Prove that if x is a string of symbols, then the Kolmogorov information in xx is greater than that in x for infinitely many strings x. Most of my students can do this one, but it looks like information expert Jonathan Wells can't.
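Kolmogorov complexity is uncomputable, so the exercise can't be checked by a program directly, but an ordinary compressor gives a crude upper-bound stand-in for description length. The sketch below (my illustration, not part of the exercise, using Python's standard zlib) shows that for a pseudorandom string x, the deflate encoding of xx is strictly longer than that of x: the copy is cheap, but not free.

```python
import random
import zlib

# Crude stand-in: compressed length is an upper bound on description
# length, not true Kolmogorov complexity.
random.seed(42)  # fixed seed so the experiment is repeatable
x = bytes(random.randrange(256) for _ in range(2000))  # incompressible-looking

cx = len(zlib.compress(x, 9))        # cost of describing x
cxx = len(zlib.compress(x + x, 9))   # cost of describing xx

# The doubled string needs strictly more bytes (deflate must spend a
# few bytes on a back-reference for the second copy), but far fewer
# than twice as many.
print(cx, cxx)
```

Under this proxy, duplication adds a little information, which is at least consistent in direction with the theorem, though it proves nothing about true Kolmogorov complexity.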

Like many incompetent people, Wells is blissfully unaware of his incompetence. He closes by saying, "Despite all their taxpayer-funded professors and museum exhibits, despite all their threats to dismantle us and expose us as retards, the Darwinists lost."

We don't have to "expose" the intelligent design creationists as buffoons; they do it themselves whenever they open their mouths.

Tom Morris said...

The Jonathan Wells quote about the Darwinists "losing" is pretty funny. All I can think of is the Black Knight scene from Monty Python.

Harriet said...

I know nothing about information theory. I'll have to learn.

But to me, it appears that "xx" carries more information than "x", because "xx" at least means "x repeated twice", and the "repeated twice" is information, isn't it?

Joshua said...

Harriet, not necessarily. Kolmogorov complexity measures the length of the shortest computer program needed to output a given string (assuming some fixed and reasonably well-behaved notion of computer program). It turns out that there are strings where a program that outputs x is longer than a program that outputs xx. But they are rare and generally contrived. Hence the phrasing of the problem as posed by Shallit.
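Joshua's description can be illustrated informally. Kolmogorov complexity is defined relative to a fixed universal machine, so treating Python source length as "program length" is only a stand-in (my example, not Joshua's), but it shows how a regular string admits a description far shorter than the string itself:

```python
# Two descriptions of the same 1,000-character string; source length
# here is only an informal stand-in for Kolmogorov complexity.
target = "ab" * 500

short_desc = '"ab" * 500'    # 10 characters: exploits the regularity
literal_desc = repr(target)  # 1,002 characters: spells everything out

assert eval(short_desc) == target
assert eval(literal_desc) == target
print(len(short_desc), len(literal_desc))  # 10 1002
```

An incompressible string, by contrast, has no description much shorter than the literal one; the exercise in the post asks for strings x where even the minimal description of xx must be strictly longer than that of x.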

Harriet said...

Wow, that is counterintuitive, at least to me. It makes me want to learn some information theory!

Gerry said...

So is it the case then that photocopying a piece of paper does increase the amount of information? And, if so, does this suggest that Kolmogorov information is not a good model for what the general public understands by the word "information"?

Jeffrey Shallit said...

So is it the case then that photocopying a piece of paper does increase the amount of information?

In the Kolmogorov model, it may or may not. The point is that it doesn't infallibly keep the amount of information the same; there are infinitely many strings x for which xx has more Kolmogorov information.

This might seem counterintuitive, but consider that if duplicating a string didn't have the potential to increase information, then I could send an arbitrarily long message for free. For example, I could send the message n by sending the string A duplicated n times.
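One standard way to make this precise (a sketch I'm adding, not from the post; C(x) denotes the Kolmogorov complexity of x):

```latex
% Claim: C(xx) > C(x) for infinitely many strings x.
% Suppose not: C(xx) \le C(x) for all but finitely many x.
% Apply this along the chain x_n = 0^{2^n}, where x_n x_n = x_{n+1}:
C\!\left(0^{2^{n+1}}\right) \le C\!\left(0^{2^n}\right)
\quad \text{for all sufficiently large } n.
% Then C(0^{2^n}) is eventually non-increasing, hence bounded by some
% constant c.  But only finitely many strings have complexity at most
% c, while the strings 0^{2^n} are pairwise distinct -- contradiction.
```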

does this suggest that Kolmogorov information is not a good model for what the general public understands by the word "information"?

It's true that the Kolmogorov model doesn't exactly capture what the average person means by information. But this is also a defect of every model that has been proposed.

IvanM said...

Prove that if x is a string of symbols, then the Kolmogorov information in xx is greater than that in x for infinitely many strings x.

A nice exercise, although I find myself wanting a stronger statement. Could "infinitely many" be replaced by "most"? (where "most" is understood to mean something like "the proportion of such strings of length n approaches 1 as n approaches infinity")

Jeffrey Shallit said...

Good question, Ivan. I suspect so, but I don't currently know how to prove that.

IvanM said...

I suspect so, but I don't currently know how to prove that.

Darn.

It turns out that there are strings where a [minimal] program that outputs x is longer than a [minimal] program that outputs xx.

I think I see one way of constructing such a string. (All strings are binary in what follows.)

First construct an integer N such that 2N has less information than N (integers being identified with their binary expansions). This can be done as follows: let A be a large incompressible even integer and B a large integer, and then let the binary expansion of 2N be given by repeating, B times, the binary expansion of A. The idea here is that the shortest program for 2N will be of the form "output A, B times" whereas the shortest program for N will be "output A, B times, except omit the last symbol."

Now let C be a long incompressible string and let x be given by... you guessed it... repeating C, N times. Then the minimal programs for x and xx will be similar (involving the programs for N and 2N, respectively), but the program for xx will be slightly shorter because the program for 2N is slightly shorter than the program for N.

This seems like it could be made precise.

Joshua said...

Regarding Ivan's question: I doubt there is any easy proof of such a result, if it is true. Since x and xx will have close to the same total information in any reasonable programming language, it isn't implausible that a suitably pathological language could push the program length down just enough for many of these strings that they end up rare. Since languages can shift the minimal length by up to constant values, this may be doable. (If that made any sense at all; I haven't studied this sort of thing in a long time.)

Anonymous said...

DNA is digital information: embedded, strand-hopping, overlapping, error-correcting, 3-dimensional information technology. DNA stores information (!) in 3 dimensions. Its sequences are overlapping, embedded, skip nucleotides, and use one or both strands of the molecule. DNA adheres to linguistics law. This means its information contains a regulated number of nucleotides of a kind WITHIN sequences. All this is why DNA holds so many books' worth of information.
DNA contains overlapping information. For example, if it included "Jack and Jill went up the hill to fetch a pail of water", it may also contain "Just a law for the town pillow" and "We of ale" and "Jan wept to etch" (except the information sequences may be incredibly long) in the same space! It has 3 known layers of error correction which replace mutated genes with a backup copy.
Algorithms (problem-solving procedures which require the collection and comparison of information), information (also immaterial knowledge which requires a sender and receiver and which defines work to be done), and linguistics law (conformity to ascribed character-per-declaration and communication foreknowledge) are products only of intelligence. Nature can never produce any one of them. This is empirical evidence that DNA (the design code of organisms) is designed. Evolutionism is not science.
All features of a cell are coded in the DNA. The cell cannot exist without the DNA, and the DNA can't exist without the cell which reproduces it. This is a chicken-and-egg problem that proves neither could exist without the other. Both must be fully formed and present for a cell OR for DNA to exist. This proves one could not evolve at all without the other. This is creation, not evolution!
DNA has not only a primary structure (the nucleotides linked in a chain), but also a secondary, tertiary and quaternary structure! It's a molecule that is 6 ft. long, and must be coiled with special proteins into a supercoiled rope that is designed to be unraveled. DNA contains more information than a library: not just the printed text, but also the INFORMATION that is in that text, including ideas, concepts, facts, and inferences.
So complex is the code of DNA that the world's best and most educated minds spend their time studying this code, and have learned almost nothing of how it functions. We have learned a lot, but what we know of DNA's instruction information is almost nothing. It is information stored and retrieved by algorithmic processes that surpass our best computer technology. Its algorithms are being discovered, and applied to many other things in science, because they are superior to those man can design.
Algorithms and information are products of intelligence only. DNA could not store information with algorithms unless it is the product of intelligent design!

Jeffrey Shallit said...

Dear Anonymous-who-is-too-timid-to-use-his-own-name:

Where's the evidence for the claim that "its algorithms are being discovered, and applied to many other things in science because they are superior to those man can design"?

Your claim that "information...are products only of intelligence" is clearly wrong. Ever heard a weather forecast? Where do you think the information to make that forecast came from?

IRON ONE said...

Information:
1 : the communication or reception of knowledge or intelligence
2 a (1) : knowledge obtained from investigation, study, or instruction

Now go find the definition of knowledge.

It is not information unless transmitted by intelligence.

Nature cannot produce algorithms. They are problem-solving procedures which require the comparison of information and a choice of work to be done based upon that comparison. Algorithms are also immaterial. They are produced by mind only. Molecules cannot produce that which is immaterial.

I realize that most amateur evos are completely unaware of the true complexity and interdependence of DNA. So here's just a teaser for them: the most complex thing ever devised by man is the code for Microsoft Windows. DNA is far more complex than this code.

Man has not and likely will never create anything as complex as DNA and its supporting systems. The complexity of it is, in the words of many scientists, "stupefying".

Zipf's Law. DNA's linguistics go even beyond Zipf's law.

Statistical linguistic study of DNA sequences

A new family of compound Poisson distribution functions from quantitative linguistics is used to study the linguistic features of DNA sequences that go beyond Zipf's law.

DNA is proof of design

Jeffrey Shallit said...

Information:
1 : the communication or reception of knowledge or intelligence
2 a (1) : knowledge obtained from investigation, study, or instruction

Dear Iron:

You seem very confused. You cannot determine the technical meaning of "information" as it is used by mathematicians and computer scientists and biologists by resorting to the vague and informal definitions used by laymen and reported in dictionaries. For example, look up "group", "ring", and "field" in your dictionary and see if you get the proper mathematical definition.

"Information" as it is understood by mathematicians and computer scientists does not have anything to do with "intelligence" and does not have to come from a mind.

I repeat my challenge:

Your claim that "information...are products only of intelligence" is clearly wrong. Ever heard a weather forecast? Where do you think the information to make that forecast came from?

Takis Konstantopoulos said...

Jeff, have you ever commented on Dr. John C. Sanford's use (or abuse) of information theory? I'd like to find out about his concept of "genetic entropy" (but I'm too lazy to read his book).

Jeffrey Shallit said...

No, I haven't read Sanford's book yet. As far as I can see, the book is self-published and hence unlikely to contain anything of interest.

Ewan said...

FACT: Nobody has ever observed the origination of complex specified information via naturalistic means without intervention of a conscious agent.

http://www.ideaclubtcw.org/video/DEJohnson.html

http://scienceintegrity.net/default.aspx

Jeffrey Shallit said...

Ewan:

Hint: just because you put it in capitals doesn't make it a fact.

Go read my paper and come back when you have something interesting to say.

Jon Covey said...

Wells said, "I replied that duplicating a gene doesn’t increase information content any more than photocopying a paper increases its information content." I think Wells meant to say that duplicating such information doesn't generate novel information. He misstated his position. Your argument is that if x is a string of symbols, then the Kolmogorov information in xx is greater than that in x for infinitely many strings x. In this context, I suppose Kolmogorov information can be very repetitious, increasing information content without adding other strings of information unrelated to the information contained in x.
What I mean is this: repeating x, to xx and so on up to n copies, adds a great deal of information content. The phrase “Me thinks it is a weasel,” where x represents this string, could be repeated ad infinitum. That would produce much information, but one would not increase in knowledge by reading several tomes of repetitious “Me thinks it is a weasel.” However, what Wells was attempting to communicate was that x doesn’t become xy simply by duplicating x. Duplicating x gives xx, and duplicating that gives xxxx. If x plus x becomes xy, where y is “The rain in Spain stays mainly in the plain,” then the content of information increases in the way evolutionists contend evolution proceeded. The string y represents information unrelated to x.
Evolutionists suppose x+x becomes x+y through a series of mutations of the second x, while the original x remains unaffected by mutations because natural selection conserves x’s information, much like the strings of information in DNA that produce essentially identical cytochrome c in yeast and man. Evolutionists assume that yeasts developed much earlier in evolutionary history than man and that natural selection conserved that string of information for cytochrome c in the common ancestor for yeast and man.
Repetitious information in DNA can be very disruptive, as in the genetic trisomies, a well-known one being trisomy 21 (Down syndrome). While the information content has increased in the trisomies, the additional information is deleterious. Evolutionists argue that one way duplicated information is allowed to mutate freely is in the case of inoperative pseudogenes (an “off-line” gene). These pseudogenes no longer come under the watchful eye of natural selection, although they may be tried for fitness from time to time as new information is hammered out in the pseudogene via mutations. In this way, pseudo-string x can become string y, which is new, operative information.

Jeffrey Shallit said...

I think Wells meant to say that duplicating such information doesn't generate novel information.

Information theory doesn't define "novel information". Creationists use the term but do not define it in any way that helps us decide what is novel and what is not.

I suppose Kolmogorov information can be very repetitious

I don't even know what this is supposed to mean. The definition of Kolmogorov information can be found in any textbook on the subject.

one would not increase in knowledge by reading several tomes of repetitious “Me thinks it is a weasel.”

Define "knowledge" and prove your claim rigorously. That's what we do in mathematics.