Wednesday, March 01, 2006

Solution to the Birthday Puzzle

Yesterday I posed the following puzzle: On what day of the year do the most people celebrate their birthday?

The answer I expected was, in fact, yesterday: February 28. The rationale is that about 1/(4*365) of all births fall on the leap day, February 29, and in roughly 3 out of every 4 years those lucky leapday people have to celebrate on some other day. For psychological reasons it seems very likely that most of them will celebrate on February 28 instead of March 1. Since the leap day sees about 1/4 as many births as an ordinary day, and its celebrants are displaced in 3 years out of 4, February 28 gets a boost of up to (1/4)*(3/4) = 3/16, or 18.75% more celebrants, averaged over many years.
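For anyone who wants to check the 3/16 figure, here is a minimal Python sketch of the arithmetic. It assumes births are spread uniformly over a 4-year cycle of 1461 days and that every leap-day baby picks February 28 in the off years; the variable names are just for illustration.

```python
from fractions import Fraction

# Assumption: births are uniform over a 4-year cycle (3 common years + 1 leap year),
# and every leap-day baby celebrates on February 28 in non-leap years.
DAYS_IN_CYCLE = 4 * 365 + 1  # 1461

leap_day_share = Fraction(1, DAYS_IN_CYCLE)      # February 29 occurs once per cycle
ordinary_day_share = Fraction(4, DAYS_IN_CYCLE)  # any other date occurs four times

# A leap day has 1/4 as many births as an ordinary date...
relative_size = leap_day_share / ordinary_day_share
# ...and those births are displaced onto February 28 in 3 years out of 4.
boost = relative_size * Fraction(3, 4)

print(leap_day_share, boost, float(boost))  # 1/1461 3/16 0.1875
```

The exact leap-day share is 1/1461 rather than 1/(4*365), but the boost comes out to 3/16 either way, since only the ratio between the leap day and an ordinary day matters.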

This analysis, however, doesn't take into account the fact that human birthdays are not uniformly distributed. I've found several datasets on the web and in the literature, and each of them exhibits rather wide variability. For example, an article by Geoffrey Berresford in Mathematics Magazine 53 (1980), 286-288 reveals that for one dataset (births in New York State for the calendar year 1977), the least likely day to give birth was December 11 (.2135%) and the most likely was July 6 (.3478%). The percentage spread for this dataset, 100%*(max-min)/min, is 62.9%, easily large enough to swamp the leap year effect! In particular, significantly more people are born during July, August, and September than during January, February, and March, and this could easily change the results.
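The spread is a one-line computation; here it is as a small Python sketch using the two percentages quoted from the 1977 data (the variable names are mine):

```python
# Daily share of births, in percent, as quoted from the 1977 New York State data.
min_share = 0.2135  # December 11, the least likely day
max_share = 0.3478  # July 6, the most likely day

# Percentage spread: how much larger the maximum is than the minimum.
spread = 100 * (max_share - min_share) / min_share
print(f"{spread:.1f}%")  # 62.9%
```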

However, there are at least two problems with relying on the data from a single year. First, there is a huge weekly fluctuation in birthdays. At least in North America, people are much less likely to be born on a weekend than on a weekday. I imagine this is due, in part, to the fact that many people are born via Caesarean section, and doctors tend to take the weekends off. (A more sinister explanation might be that doctors tend to manipulate birth times with drugs to make sure they occur during a weekday.) Thus, we cannot rely on the data for a single year, but need to average over many years to make sure we get accurate data. Fortunately, there are some other datasets available. This one appears to be for the year 1978; it has a spread of 50.1%. One commenter suggested looking at this dataset, which represents applications received by a life insurance company over a 14-year period, and so probably tends to average out the weekly effects. It has a spread of only 38.5%, if one ignores the data for February 29.

A second, but related, problem is that some dates are more likely to fall on a particular day of the week than others. For example, the 13th of the month is more likely to fall on a Friday than on any other day! (This was noticed as early as 1933 in a Monthly problem; for the reference and an explanation, see here.) As it turns out, over the 400-year Gregorian cycle February 28 falls on a weekend 28.75% of the time (115 times out of 400), and February 29 falls on a weekend about 27.8% of the time (27 of its 97 occurrences), whereas a uniform distribution would give about 28.57% for both. Coupled with the observation in the previous paragraph, this might be another reason why February 28 might be under-represented, although the effect is quite small, affecting only the 3rd decimal place.
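These weekend frequencies are easy to count directly. The sketch below uses Python's standard datetime module over the years 2000 to 2399, which is one full 400-year Gregorian cycle; the helper name weekend_share and the choice of starting year are my own.

```python
from datetime import date

def weekend_share(month, day, start_year=2000, span=400):
    """Count weekend occurrences of a calendar date over `span` years.

    Returns (weekend_count, total_occurrences). Years in which the date
    does not exist (February 29 in a common year) are skipped.
    """
    weekend = total = 0
    for year in range(start_year, start_year + span):
        try:
            d = date(year, month, day)
        except ValueError:  # e.g. February 29 in a non-leap year
            continue
        total += 1
        if d.weekday() >= 5:  # Monday is 0, so 5 = Saturday and 6 = Sunday
            weekend += 1
    return weekend, total

for month, day in [(2, 28), (2, 29)]:
    weekend, total = weekend_share(month, day)
    print(f"{month}/{day}: {weekend}/{total} = {100 * weekend / total:.2f}%")
```

Run over those 400 years, this counts 115 weekend occurrences out of 400 for February 28 and 27 out of 97 for February 29. Because the Gregorian calendar repeats exactly every 400 years, any other starting year gives the same counts.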

If we use the insurance data mentioned above, then there were 1319 births on February 28 and 325 on February 29. Since in 3 out of every 4 years, the leapies will celebrate their birthdays on February 28, this gives an effective score of 1319 + (3/4)*325 = 1562.75 for February 28. This just barely beats out the otherwise-most-frequent date of August 15, with 1559 births. Here the difference is in the 3rd decimal place! Maybe February 28 really is the most frequently celebrated after all, but if so, it's very close.
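Here is the same comparison as a short Python sketch, using only the three counts quoted from the insurance data. It keeps the assumption that all leap-day celebrants move to February 28 in 3 years out of 4; the variable names are illustrative.

```python
# Counts quoted from the 14-year insurance dataset.
feb_28_births = 1319
feb_29_births = 325
aug_15_births = 1559  # the most frequent date before the leap-day adjustment

# Assumption: leap-day babies celebrate on February 28 in 3 years out of 4.
effective_feb_28 = feb_28_births + 0.75 * feb_29_births

# Relative margin of February 28 over August 15.
margin = effective_feb_28 / aug_15_births - 1
print(effective_feb_28, f"{margin:.4f}")  # 1562.75 0.0024
```

The margin works out to roughly a quarter of a percent, so a small change in either the data or the fraction of leapies who choose February 28 could reverse the result.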

5 comments:

PaulC said...

(A more sinister explanation might be that doctors tend to manipulate birth times with drugs to make sure they occur during a weekday.)
I've witnessed it firsthand. The sinister drug is called pitocin.

C-sections are not the only way to schedule a birthday.

Jonathan Badger said...

the least likely day to give birth was December 11 (.2135%) and the most likely was July 6 (.3478%).

Hmmm. Spooky coincidence time. John Kerry, who lost the last election, was born on Dec 11, 1943. The winner, George W. Bush, was born on July 6, 1946.

Ithika said...

I had a conversation with someone at work about 29 February birthdays. They knew someone who was born on that date so I asked when they celebrated on the 'off' years.

And the answer — the nearest day to the weekend!

So you might be more right than you guessed ;)

Don Sheffler said...

I love stuff like that.

And this:
"the least likely day to give birth was December 11 (.2135%) and the most likely was July 6 (.3478%).

Hmmm. Spooky coincidence time. John Kerry, who lost the last election, was born on Dec 11, 1943. The winner, George W. Bush, was born on July 6, 1946."


________

My own personal favorite tidbit is regarding the exact middle of the year:

It's not at midnight between June 30 and July 1 as you might envision because the calendar is a little top heavy on the second half. The middle of the calendar year is at noon on July 2.

(This is excepting all the leap seconds and other anomalies. On leap years it's midnight between July 1 and July 2.)

Carry on.

Caleb said...

One interesting fact has to do not with births on a specific day but in a specific year. The Chinese lunar calendar rotates between 12 different animals, one per year, and of these the dragon is considered the best and most auspicious. As such, the number of births in a dragon year has been known to be up to 20% higher than in normal years. This creates a huge strain, as services such as schools have to cater for the excess demand compared to normal years.

Also interesting is the idea that births could be seasonal. Is it more likely that children are conceived during holiday periods? Do marriage dates affect when babies are born?