How long ago do you think your common ancestor with the moose lived?
Take a guess, and then check your answer here.
Recurrent thoughts about mathematics, science, politics, music, religion, and
Recurrent thoughts about mathematics, science, politics, music, religion, and
Recurrent thoughts about mathematics, science, politics, music, religion, and
Recurrent thoughts about ....
Friday, September 24, 2010
Tuesday, September 21, 2010
An Array Initialization Trick
Here is a neat trick that lets you avoid initializing large arrays. I learned about it long ago from Aho, Hopcroft, and Ullman, The Design and Analysis of Computer Algorithms, an old but still great book — but I think it is probably an old hacker's idea.
To illustrate the idea, suppose we are trying to solve the element distinctness problem on a restricted universe. In this problem we are given a list of integers and we want to determine if there are any repeated elements. The input is an array A[1..n] of integers, where each integer is between (say) 1 and M, where M is not so large compared to n — maybe M is about 10n or something like that.
The usual way to do this is to create an array B[1..M] such that B[j] = 1 if j is an element of A, and 0 otherwise. We start by initializing all the entries of B to 0. Then we loop through the elements of A. For each i, 1 ≤ i ≤ n, we first check B[A[i]]. If it equals 1, then the value A[i] already occurred in A. Otherwise, we set B[A[i]] := 1, to signify that we've seen A[i].
If M = O(n), this gives a linear-time algorithm for element distinctness. It even lets us list the repeated elements, if there are any.
Now here's the deal. Suppose we are solving element distinctness many times in a program, as a subroutine. Then it is conceivable that initializing all the entries of B to 0 could actually be a significant time drain. Although we have to allocate the memory for B once at the start, could there be a way to avoid the time-consuming Θ(M) initialization for B each time we call the subroutine again? We have to handle the problem that the entries of B could be arbitrary junk that we have no control over.
The answer is yes, using the following trick: instead of containing 1 or 0, we will set it up so that B[j] contains the position p in A where j is found. The key point is that we always actually verify this by looking at A[p], so we can never go wrong, even if B[j] is junk. More precisely, we want to ensure that if B[j] = p, then A[p] = j.
Now for each i we are going to check to see if A[i] = d has already been seen. To do this, we look in B[d]; say it equals c. If c is not between 1 and i-1, then we know that c represents uninitialized junk, so we can confidently ignore it and set B[d] = i.
If c is between 1 and i-1, then it either represents a true pointer back to the place in A where d was found earlier, or it just happens to be junk that is in the right range. In either case, we look at A[c]. Either it equals d, in which case we have found d earlier in the array at the entry A[c], or it equals something else. In the latter case we also set B[d] = i. This works whether B[d] is a true pointer or not!
Here's the code:
And that's it! This code fragment prints out the duplications. Of course, if you'd rather terminate as soon as a repeated element is found, you can do that instead. Or if you'd rather save the repeated elements in a linked list, you can do that, too. And of course, this trick can be used in other settings, such as large sparse matrices, where you want to avoid initialization costs.
To illustrate the idea, suppose we are trying to solve the element distinctness problem on a restricted universe. In this problem we are given a list of integers and we want to determine if there are any repeated elements. The input is an array A[1..n] of integers, where each integer is between (say) 1 and M, where M is not so large compared to n — maybe M is about 10n or something like that.
The usual way to do this is to create an array B[1..M] such that B[j] = 1 if j is an element of A, and 0 otherwise. We start by initializing all the entries of B to 0. Then we loop through the elements of A. For each i, 1 ≤ i ≤ n, we first check B[A[i]]. If it equals 1, then the value A[i] already occurred in A. Otherwise, we set B[A[i]] := 1, to signify that we've seen A[i].
If M = O(n), this gives a linear-time algorithm for element distinctness. It even lets us list the repeated elements, if there are any.
Now here's the deal. Suppose we are solving element distinctness many times in a program, as a subroutine. Then it is conceivable that initializing all the entries of B to 0 could actually be a significant time drain. Although we have to allocate the memory for B once at the start, could there be a way to avoid the time-consuming Θ(M) initialization for B each time we call the subroutine again? We have to handle the problem that the entries of B could be arbitrary junk that we have no control over.
The answer is yes, using the following trick: instead of containing 1 or 0, we will set it up so that B[j] contains the position p in A where j is found. The key point is that we always actually verify this by looking at A[p], so we can never go wrong, even if B[j] is junk. More precisely, we want to ensure that if B[j] = p, then A[p] = j.
Now for each i we are going to check to see if A[i] = d has already been seen. To do this, we look in B[d]; say it equals c. If c is not between 1 and i-1, then we know that c represents uninitialized junk, so we can confidently ignore it and set B[d] = i.
If c is between 1 and i-1, then it either represents a true pointer back to the place in A where d was found earlier, or it just happens to be junk that is in the right range. In either case, we look at A[c]. Either it equals d, in which case we have found d earlier in the array at the entry A[c], or it equals something else. In the latter case we also set B[d] = i. This works whether B[d] is a true pointer or not!
Here's the code:
for i := 1 to n do
d := A[i]
c := B[d]
if (c < 1) or (c ≥ i) then
B[d] := i
else if A[c] = d then
print(d, " occurred at positions ", c," and ", i)
else B[d] := i;
And that's it! This code fragment prints out the duplications. Of course, if you'd rather terminate as soon as a repeated element is found, you can do that instead. Or if you'd rather save the repeated elements in a linked list, you can do that, too. And of course, this trick can be used in other settings, such as large sparse matrices, where you want to avoid initialization costs.
Monday, September 20, 2010
Doug Groothuis's "Six Enemies of Apologetic Engagement"
Over at the creationist "Leadership University" site, Doug Groothuis has a piece called "Six Enemies of Apologetic Engagement", where he lists some ways that evangelical Christians fail to carry out their mission effectively.
It's a real hoot. "Ignorance" is one of the enemies, but Groothuis also makes the bogus claim that "macro-evolution is false, and good arguments have been raised against it from both nature and Scripture". He actually cites the vastly-ignorant Phillip Johnson -- a lawyer with no training in biology -- as someone who has made good arguments with "intellectual integrity" against evolution. (Groothuis also misspells Johnson's first name.)
I remember the time that Phillip Johnson arrived at the Usenet newsgroup talk.origins, back in 1993. He arrogantly rode in on his evangelical high horse to do battle with the godless evolutionists, confident that his rhetorical skills would hide his lack of biological knowledge. The result was not pretty at all. Johnson had to leave in a cowardly huff because he couldn't handle the criticism from people who actually knew something about the subject.
"Cowardice" and "arrogance", however, are two of Groothuis's problems with evangelicals. He says that evangelicals should "cultivate real dialogue with unbelievers". Is that the very same Doug Groothuis who routinely bans commenters at his blog who disagree with his claims? Why, I believe it is.
Doug -- you're a first-class hypocrite.
It's a real hoot. "Ignorance" is one of the enemies, but Groothuis also makes the bogus claim that "macro-evolution is false, and good arguments have been raised against it from both nature and Scripture". He actually cites the vastly-ignorant Phillip Johnson -- a lawyer with no training in biology -- as someone who has made good arguments with "intellectual integrity" against evolution. (Groothuis also misspells Johnson's first name.)
I remember the time that Phillip Johnson arrived at the Usenet newsgroup talk.origins, back in 1993. He arrogantly rode in on his evangelical high horse to do battle with the godless evolutionists, confident that his rhetorical skills would hide his lack of biological knowledge. The result was not pretty at all. Johnson had to leave in a cowardly huff because he couldn't handle the criticism from people who actually knew something about the subject.
"Cowardice" and "arrogance", however, are two of Groothuis's problems with evangelicals. He says that evangelicals should "cultivate real dialogue with unbelievers". Is that the very same Doug Groothuis who routinely bans commenters at his blog who disagree with his claims? Why, I believe it is.
Doug -- you're a first-class hypocrite.
Tuesday, September 14, 2010
A Mathematico-Religious Poem
Here's a poem by Hooper Reynolds Goodwin that was published in the journal Scripta Mathematica in 1953. I liked it very much when I was a teenager, and I still think the last two lines are excellent.
PARABOLA
This curve I'm plotting? A parabola.
This point is called the focus; it's the point—
Oh, no, not an ellipse. Ellipses have two foci:
Here, I'll show you one I've drawn.
You see the difference. These two lines of the parabola,
They stretch out wide and wider,
"World without end," as preachers say.
(I don't know what they mean; perhaps they don't);
But you see how it goes.
There was a man—Sir Isaac Newton, I believe it was—
Who had the notion a parabola was an ellipse,
Its other focus at infinity.
You may not understand just what he meant;
You have to sort of take the thing on faith.
The keenest scholar can't quite picture it, you know.
I've often thought,
It might be called a symbol of man's life;
A curve of ever-widening sweep.
And here in this world
Is the focus we may call, say, temporal interests,
Food and drink and clothes . . .
But yet it cannot be that this is all;
For out beyond the reach of sight must be
Another point, a heavenly focus, see?
'Round which the sweeping curves of human life
Complete the ellipse.
Fantastic? Well, perhaps,
But yet the more I think of it...
And here—
Another thing I've often thought about:
Suppose we draw here two parabolas
With axes parallel, and let the arms cross—
"Intersect" the word is—at this point.
Now if there be a focus
Somewhere out beyond the bounds of space,
And these are two ellipses,
As Sir Isaac thought they were,
Why, don't you see, they'll intersect again
Somewhere out there.
Just as two lives that once have crossed,
Then gone their separate ways,
And one has disappeared long since into the void of death
May—but who knows? It's just a thought...
Well, come again; I don't get callers often.
They don't see much in old folks nowadays,
And when a man's not only old, but got his head
Stuck always in a book of "AnaIyt!"
Young people think I'm queer; they can't see why
A man that doesn't have to study graphs
Should plague his head; don't understand that such
Dry, dull things as a parabolic curve
May bring up mem'ries of a face that's gone.
The author, Hooper Reynolds Goodwin (December 5 1891-October 13 1964), was born in Marblehead, Massachusetts. Raised by Unitarians, he attended Boston University and was ordained as a Methodist minister. Later he became an Episcopalian minister. He served in churches in Bethel and Randolph, Vermont.
I am very grateful to his granddaughter, Sue Maltais, for providing a photo and information about his life.
PARABOLA
This curve I'm plotting? A parabola.
This point is called the focus; it's the point—
Oh, no, not an ellipse. Ellipses have two foci:
Here, I'll show you one I've drawn.
You see the difference. These two lines of the parabola,
They stretch out wide and wider,
"World without end," as preachers say.
(I don't know what they mean; perhaps they don't);
But you see how it goes.
There was a man—Sir Isaac Newton, I believe it was—
Who had the notion a parabola was an ellipse,
Its other focus at infinity.
You may not understand just what he meant;
You have to sort of take the thing on faith.
The keenest scholar can't quite picture it, you know.
I've often thought,
It might be called a symbol of man's life;
A curve of ever-widening sweep.
And here in this world
Is the focus we may call, say, temporal interests,
Food and drink and clothes . . .
But yet it cannot be that this is all;
For out beyond the reach of sight must be
Another point, a heavenly focus, see?
'Round which the sweeping curves of human life
Complete the ellipse.
Fantastic? Well, perhaps,
But yet the more I think of it...
And here—
Another thing I've often thought about:
Suppose we draw here two parabolas
With axes parallel, and let the arms cross—
"Intersect" the word is—at this point.
Now if there be a focus
Somewhere out beyond the bounds of space,
And these are two ellipses,
As Sir Isaac thought they were,
Why, don't you see, they'll intersect again
Somewhere out there.
Just as two lives that once have crossed,
Then gone their separate ways,
And one has disappeared long since into the void of death
May—but who knows? It's just a thought...
Well, come again; I don't get callers often.
They don't see much in old folks nowadays,
And when a man's not only old, but got his head
Stuck always in a book of "AnaIyt!"
Young people think I'm queer; they can't see why
A man that doesn't have to study graphs
Should plague his head; don't understand that such
Dry, dull things as a parabolic curve
May bring up mem'ries of a face that's gone.
The author, Hooper Reynolds Goodwin (December 5 1891-October 13 1964), was born in Marblehead, Massachusetts. Raised by Unitarians, he attended Boston University and was ordained as a Methodist minister. Later he became an Episcopalian minister. He served in churches in Bethel and Randolph, Vermont.
I am very grateful to his granddaughter, Sue Maltais, for providing a photo and information about his life.
Friday, September 10, 2010
New Journal with Clunky Name
It seems every week I get an announcement for a new journal. This week it's Journal for Algebra and Number Theory Academia, which gets my nomination for the Clunkiest New Journal Title of 2010.
Wednesday, September 08, 2010
My Sabbatical is Over
My sabbatical is over and it's back to a teaching term. Classes start next week.
Here's what I'm doing today (with updates throughout the day):
5:30 AM Woke up, had breakfast, and started answering e-mail. Hey, the Phillies are in first place in their division! I worked on processing mail for the journal I edit, the Journal of Integer Sequences.
7:40 AM The kids are off to school - both of them are now in high school! (urk)
8:30 AM Arrived at work. I'll be teaching a multi-section course on algorithms. I'm teaching two sections, with 60 students each, and another instructor is teaching the third -- so there's a lot of coordination to do. I sent e-mail to the other instructor with some suggestions about the first two assignments. I found my book of notes and overhead slides from the last time I taught the course. Some things I can reuse, but other things I will revise.
8:50 AM Time for coffee! While I was gone, we got an espresso machine. I don't like the taste, though -- the coffee machine at Oberwolfach was much better. The one here at Waterloo makes coffee that's way too bitter for me.
9:00 AM Damn, the radio feed from WBUR is acting up, so listening to the news is painful. I switch to the BBC radio program "Newshour".
9:10 AM There's a really interesting article by Lionel Levine and Jim Propp in the latest issue of the Notices of the AMS, on sandpiles. I'll have to think about this stuff more when I get a chance.
9:16 AM Answered e-mail about upcoming information session on graduate study in CS.
9:28 AM Noga Alon is visiting Waterloo, so I sent him an e-mail message asking if he would have a few minutes today or tomorrow to discuss a problem.
9:30 AM Working on the first assignment for my algorithms course. It's not easy to create good, interesting problems about big-O notation. And I want problems whose solution isn't on the web! I had a good one last year but I don't want to reuse it.
10:15 AM Got a couple of problems written, but still looking for a really hard one. Time to go check my (physical) mailbox on the second floor. Most mail comes electronically these days, but still...
10:35 AM OK, I have a draft of the first problem set done. Not completely happy with it. Sent it off to the other instructor for his comments. Noga Alon says he'll be in soon. Now - time to answer some e-mail.
11:20 AM E-mail! It's the bane of my existence. I get lots of messages a day, and don't know what to do with all of them. Best is a message that requires little thought and demands an immediate response. Worst is somebody asking a question that I'm not quite sure how to answer. I don't answer and it gets buried, perhaps never to emerge again. My mailbox always has hundreds of messages and is slow to load. Wish I could get it better organized.
Eating lunch while working. I always get hungry around 11 AM and it's hard to resist eating lunch early.
11:50 AM A colleague from another country was denied a visa to visit Canada and present a paper at a conference. This is really outrageous. I've got an appointment with my MP next week to discuss this case. I'm printing out the documentation that the colleague sent me.
12:00 Noon There's a new version of the APL I use on my mac, APL X. Paid $160 Canadian for the update through paypal.
12:30 PM Spent the last half hour trying to submit a paper to Information Processing Letters. Gone are the days when you could submit a paper by e-mailing some files to the editor. Now you have to go through a web-based form where you attach files, etc. These nearly always are terrible, offering way too many options for some things and not enough for others. I've spent 30 minutes on it so far and am still not done.
12:45 PM Whew! Submission finally done.
12:50 PM Got passcode for new version of APL, and installed it. Seems to work OK.
1:00 PM A colleague once suggested trying to find examples of k-tuples of integers S where g0 (S) > g1 (S) > g2 (S), where gi (S) is the largest integer having exactly i representations as an N-linear combination of the elements of S. Ran my little APL program and found the example S = (40,46,59,61,92), where g0 = 373, g1 = 354, and g2 = 340. It would be interesting to show there are arbitrarily long descending chains like that. Do you need larger and larger k-tuples to get them?
1:20 PM Went down to the visitor's office to try to find Noga Alon, but didn't find him.
1:25 PM Worked on the algorithms course web page.
1:35 PM Off to the grad course info session to present information about my Winter 2011 course on formal languages.
1:55 PM Back from the course presentation.
2:00 PM Noga Alon kindly came by and we talked about the separating words problem. At the same time our local computer people came by and tried to figure out why Thunderbird has lots of problems with my mail. Chaos!
3:30 PM Department meeting - free cookies. We learned about a new draft policy on privacy at the university, which as currently drafted would have the unfortunate effect of preventing professors from archiving things like student project for more than one year. This would make writing recommendation letters difficult in many cases. I certainly hope this draft policy will be rewritten.
5:15 PM Department meeting's over, and we're done for the day.
Now you know what a typical non-teaching day is like.
Here's what I'm doing today (with updates throughout the day):
5:30 AM Woke up, had breakfast, and started answering e-mail. Hey, the Phillies are in first place in their division! I worked on processing mail for the journal I edit, the Journal of Integer Sequences.
7:40 AM The kids are off to school - both of them are now in high school! (urk)
8:30 AM Arrived at work. I'll be teaching a multi-section course on algorithms. I'm teaching two sections, with 60 students each, and another instructor is teaching the third -- so there's a lot of coordination to do. I sent e-mail to the other instructor with some suggestions about the first two assignments. I found my book of notes and overhead slides from the last time I taught the course. Some things I can reuse, but other things I will revise.
8:50 AM Time for coffee! While I was gone, we got an espresso machine. I don't like the taste, though -- the coffee machine at Oberwolfach was much better. The one here at Waterloo makes coffee that's way too bitter for me.
9:00 AM Damn, the radio feed from WBUR is acting up, so listening to the news is painful. I switch to the BBC radio program "Newshour".
9:10 AM There's a really interesting article by Lionel Levine and Jim Propp in the latest issue of the Notices of the AMS, on sandpiles. I'll have to think about this stuff more when I get a chance.
9:16 AM Answered e-mail about upcoming information session on graduate study in CS.
9:28 AM Noga Alon is visiting Waterloo, so I sent him an e-mail message asking if he would have a few minutes today or tomorrow to discuss a problem.
9:30 AM Working on the first assignment for my algorithms course. It's not easy to create good, interesting problems about big-O notation. And I want problems whose solution isn't on the web! I had a good one last year but I don't want to reuse it.
10:15 AM Got a couple of problems written, but still looking for a really hard one. Time to go check my (physical) mailbox on the second floor. Most mail comes electronically these days, but still...
10:35 AM OK, I have a draft of the first problem set done. Not completely happy with it. Sent it off to the other instructor for his comments. Noga Alon says he'll be in soon. Now - time to answer some e-mail.
11:20 AM E-mail! It's the bane of my existence. I get lots of messages a day, and don't know what to do with all of them. Best is a message that requires little thought and demands an immediate response. Worst is somebody asking a question that I'm not quite sure how to answer. I don't answer and it gets buried, perhaps never to emerge again. My mailbox always has hundreds of messages and is slow to load. Wish I could get it better organized.
Eating lunch while working. I always get hungry around 11 AM and it's hard to resist eating lunch early.
11:50 AM A colleague from another country was denied a visa to visit Canada and present a paper at a conference. This is really outrageous. I've got an appointment with my MP next week to discuss this case. I'm printing out the documentation that the colleague sent me.
12:00 Noon There's a new version of the APL I use on my mac, APL X. Paid $160 Canadian for the update through paypal.
12:30 PM Spent the last half hour trying to submit a paper to Information Processing Letters. Gone are the days when you could submit a paper by e-mailing some files to the editor. Now you have to go through a web-based form where you attach files, etc. These nearly always are terrible, offering way too many options for some things and not enough for others. I've spent 30 minutes on it so far and am still not done.
12:45 PM Whew! Submission finally done.
12:50 PM Got passcode for new version of APL, and installed it. Seems to work OK.
1:00 PM A colleague once suggested trying to find examples of k-tuples of integers S where g0 (S) > g1 (S) > g2 (S), where gi (S) is the largest integer having exactly i representations as an N-linear combination of the elements of S. Ran my little APL program and found the example S = (40,46,59,61,92), where g0 = 373, g1 = 354, and g2 = 340. It would be interesting to show there are arbitrarily long descending chains like that. Do you need larger and larger k-tuples to get them?
1:20 PM Went down to the visitor's office to try to find Noga Alon, but didn't find him.
1:25 PM Worked on the algorithms course web page.
1:35 PM Off to the grad course info session to present information about my Winter 2011 course on formal languages.
1:55 PM Back from the course presentation.
2:00 PM Noga Alon kindly came by and we talked about the separating words problem. At the same time our local computer people came by and tried to figure out why Thunderbird has lots of problems with my mail. Chaos!
3:30 PM Department meeting - free cookies. We learned about a new draft policy on privacy at the university, which as currently drafted would have the unfortunate effect of preventing professors from archiving things like student project for more than one year. This would make writing recommendation letters difficult in many cases. I certainly hope this draft policy will be rewritten.
5:15 PM Department meeting's over, and we're done for the day.
Now you know what a typical non-teaching day is like.
Friday, September 03, 2010
Doug Groothuis Bans Me
Wow, Doug Groothuis has banned me from his blog for writing the following, which he refused to publish:
I think this excerpt shows why, for non-Christians, C. S. Lewis's philosophy is regarded as deficient.
Lewis didn't know anything about evolution. He didn't understand that what he called "morality" is a fact about human evolution; that we are programmed by evolution and culture to regard certain behaviors of others as acceptable and other behaviors as less so. Once this is understood, Lewis's confusion simply vanishes.
He calls this "pugilistic, pugnacious, and pernicious propositions."
Students of Groothuis should be aware: he does not tolerate any kind of dissent. If I were you, I'd look for another teacher, one that respects the give-and-take necessary to acquire knowledge.
I think this excerpt shows why, for non-Christians, C. S. Lewis's philosophy is regarded as deficient.
Lewis didn't know anything about evolution. He didn't understand that what he called "morality" is a fact about human evolution; that we are programmed by evolution and culture to regard certain behaviors of others as acceptable and other behaviors as less so. Once this is understood, Lewis's confusion simply vanishes.
He calls this "pugilistic, pugnacious, and pernicious propositions."
Students of Groothuis should be aware: he does not tolerate any kind of dissent. If I were you, I'd look for another teacher, one that respects the give-and-take necessary to acquire knowledge.