Consider the probability $Q_1(n,d)$ that no two people out of a group of $n$ will have matching birthdays out of $d$ equally possible birthdays. Start with an arbitrary person's birthday, then note that the probability that the second person's birthday is different is $(d-1)/d$, that the third person's birthday is different from the first two is $(d-2)/d$, and so on, up through the $n$th person. Explicitly,
\begin{align}
Q_1(n,d) &= \frac{d-1}{d}\,\frac{d-2}{d}\cdots\frac{d-(n-1)}{d} \tag{1}\\
&= \frac{(d-1)(d-2)\cdots[d-(n-1)]}{d^{\,n-1}}. \tag{2}
\end{align}
But this can be written in terms of factorials as
$$Q_1(n,d) = \frac{d!}{(d-n)!\,d^n}, \tag{3}$$
so the probability $P_2(n,d)$ that two or more people out of a group of $n$ do have the same birthday is therefore
\begin{align}
P_2(n,d) &= 1 - Q_1(n,d) \tag{4}\\
&= 1 - \frac{d!}{(d-n)!\,d^n}. \tag{5}
\end{align}
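As a quick numerical illustration of the complement rule above, the following sketch multiplies the factors $(d-i)/d$ for $i = 0, \ldots, n-1$ and subtracts the product from 1. The function names `q1` and `p2` are chosen here for illustration only; exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction


def q1(n: int, d: int) -> Fraction:
    """Probability that n people all have distinct birthdays among d equally likely days."""
    prob = Fraction(1)
    for i in range(n):
        # The (i+1)-th person must avoid the i birthdays already taken.
        prob *= Fraction(d - i, d)
    return prob


def p2(n: int, d: int) -> Fraction:
    """Probability that at least two of n people share a birthday."""
    return 1 - q1(n, d)


if __name__ == "__main__":
    print(float(p2(23, 365)))  # roughly 0.5073
```

For $n > d$ one of the factors is zero, so `p2` returns exactly 1, as the pigeonhole principle requires.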
In general, let $Q_i(n,d)$ denote the probability that a birthday is shared by exactly $i$ (and no more) people out of a group of $n$ people. Then the probability that a birthday is shared by $k$ or more people is given by
$$P_k(n,d) = 1 - \sum_{i=1}^{k-1} Q_i(n,d). \tag{6}$$
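The probabilities $Q_k(n,d)$ can be computed using the recurrence relation
$$Q_k(n,d) = \sum_{i=1}^{\lfloor n/k\rfloor}\left[\frac{n!\,d!}{d^{\,ik}\,i!\,(k!)^i\,(n-ik)!\,(d-i)!}\;\frac{(d-i)^{\,n-ik}}{d^{\,n-ik}}\sum_{j=1}^{k-1} Q_j(n-ik,\,d-i)\right] \tag{7}$$
(Finch 1997). However, the time to compute this recursive function grows exponentially with $k$ and so rapidly becomes unwieldy.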
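The recurrence can be sketched in a few lines. The version below assumes the form in which exactly $i$ days carry $k$ people each and the remaining $n - ik$ people fall on the other $d - i$ days with every group smaller than $k$. The names `q` and `p_at_least`, the memoization, and the convention that an empty group counts under $k = 1$ are choices made for this illustration, not part of the cited source.

```python
from fractions import Fraction
from functools import lru_cache
from math import factorial


@lru_cache(maxsize=None)
def q(k: int, n: int, d: int) -> Fraction:
    """Q_k(n, d): probability that the largest birthday group has exactly k people."""
    if n == 0:
        # Convention assumed here: an empty group counts as the k = 1 case.
        return Fraction(1 if k == 1 else 0)
    if d == 0:
        return Fraction(0)
    if k == 1:
        if n > d:
            return Fraction(0)
        return Fraction(factorial(d), factorial(d - n) * d**n)
    total = Fraction(0)
    for i in range(1, min(n // k, d) + 1):
        rest, days = n - i * k, d - i  # people and days left after i groups of k
        ways = Fraction(
            factorial(n) * factorial(d),
            d**(i * k) * factorial(i) * factorial(k)**i
            * factorial(rest) * factorial(days),
        )
        smaller = sum((q(j, rest, days) for j in range(1, k)), Fraction(0))
        total += ways * smaller * Fraction(days, d)**rest
    return total


def p_at_least(k: int, n: int, d: int) -> Fraction:
    """P_k(n, d): probability that some birthday is shared by k or more people."""
    return 1 - sum((q(i, n, d) for i in range(1, k)), Fraction(0))
```

Caching keeps small cases such as `p_at_least(3, 88, 365)` fast, although the number of distinct subproblems still grows quickly with $k$.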
If a 365-day year is assumed, i.e., the existence of leap days is ignored, and the distribution of birthdays is assumed to be uniform throughout the year (in reality, births in the United States run more than 6% above average in September; Peterson 1998), then the number of people needed for there to be at least a 50% chance that at least two share a birthday is the smallest $n$ such that $P_2(n,365)\ge 1/2$. This is given by $n=23$, since
\begin{align}
P_2(23,365) &= 1 - \frac{365!}{342!\,365^{23}} \tag{8}\\
&\approx 0.507297. \tag{9}
\end{align}
The numbers $n$ of people needed to obtain $P_2(n,d)\ge 1/2$ for $d=1$, 2, ... are 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, ... (OEIS A033810). The minimal numbers of people needed to give a 50% probability of having at least $k$ coincident birthdays for $k=1$, 2, ... are 1, 23, 88, 187, 313, 460, 623, 798, 985, 1181, 1385, 1596, 1813, ... (OEIS A014088; Diaconis and Mosteller 1989).
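The first of these sequences is easy to reproduce by direct search. The sketch below (with the illustrative name `min_people_for_half`) extends the no-match product one person at a time until the probability of a shared birthday reaches 1/2.

```python
from fractions import Fraction


def min_people_for_half(d: int) -> int:
    """Smallest n for which at least two of n people share one of d days with probability >= 1/2."""
    no_match = Fraction(1)
    n = 0
    while True:
        n += 1
        # Person n must avoid the n - 1 birthdays already taken.
        no_match *= Fraction(d - (n - 1), d)
        if 1 - no_match >= Fraction(1, 2):
            return n


if __name__ == "__main__":
    print([min_people_for_half(d) for d in range(1, 11)])
    print(min_people_for_half(365))
```

Exact fractions matter for small $d$: with $d = 2$ the probability at $n = 2$ is exactly 1/2, and a floating-point comparison could in principle land on the wrong side of the threshold.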
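The probability $P_2(n,d)$ can be estimated as
\begin{align}
P_2(n,d) &\approx 1 - e^{-n(n-1)/(2d)} \tag{10}\\
&\approx 1 - \left(1 - \frac{n}{2d}\right)^{n-1}, \tag{11}
\end{align}
where the latter has error
$$\epsilon < \frac{n^3}{6(d-n+1)^2} \tag{12}$$
(Sayrafiezadeh 1994).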
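To get a feel for the quality of these estimates, the sketch below compares them with the exact value. It assumes the two approximations take the forms $1 - e^{-n(n-1)/(2d)}$ and $1 - (1 - n/(2d))^{n-1}$; the function names are illustrative.

```python
import math


def p2_exact(n: int, d: int) -> float:
    """Exact probability of at least one shared birthday (floating point)."""
    no_match = 1.0
    for i in range(n):
        no_match *= (d - i) / d
    return 1.0 - no_match


def p2_exponential(n: int, d: int) -> float:
    """Exponential estimate 1 - exp(-n(n-1)/(2d))."""
    return 1.0 - math.exp(-n * (n - 1) / (2 * d))


def p2_power(n: int, d: int) -> float:
    """Power estimate 1 - (1 - n/(2d))**(n-1)."""
    return 1.0 - (1.0 - n / (2 * d)) ** (n - 1)


if __name__ == "__main__":
    for n in (10, 23, 40):
        print(n, p2_exact(n, 365), p2_exponential(n, 365), p2_power(n, 365))
```

For $n = 23$ and $d = 365$ both estimates fall within about 0.01 of the exact value.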
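$Q_2(n,d)$ can be computed explicitly as
\begin{align}
Q_2(n,d) &= \frac{n!}{d^n}\sum_{i=1}^{\lfloor n/2\rfloor}\frac{1}{2^i}\binom{d}{i}\binom{d-i}{n-2i} \tag{13}\\
&= \frac{n!}{d^n}\binom{d}{n}\left[{}_2F_1\!\left(-\tfrac{n}{2},\tfrac{1-n}{2};d-n+1;2\right)-1\right], \tag{14}
\end{align}
where $\binom{n}{m}$ is a binomial coefficient and ${}_2F_1(a,b;c;z)$ is a hypergeometric function. This gives the explicit formula for $P_3(n,d)$ as
\begin{align}
P_3(n,d) &= 1 - Q_1(n,d) - Q_2(n,d) \tag{15}\\
&= 1 - \frac{d!}{d^n}\,{}_2\tilde F_1\!\left(-\tfrac{n}{2},\tfrac{1-n}{2};d-n+1;2\right), \tag{16}
\end{align}
where ${}_2\tilde F_1(a,b;c;z)$ is a regularized hypergeometric function.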
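The finite-sum form is straightforward to evaluate directly. The sketch below assumes the counting argument behind it: choose $i$ days that each carry a pair, choose $n - 2i$ further days that each carry a single person, and arrange the $n$ people in $n!/2^i$ ways. The names `q1`, `q2`, and `p3` are illustrative, and `math.comb` requires Python 3.8 or later.

```python
from fractions import Fraction
from math import comb, factorial


def q1(n: int, d: int) -> Fraction:
    """Probability that all n birthdays are distinct."""
    return Fraction(factorial(n) * comb(d, n), d**n)


def q2(n: int, d: int) -> Fraction:
    """Probability that some birthday is shared by exactly two people and none by more."""
    total = Fraction(0)
    for i in range(1, min(n // 2, d) + 1):
        # i days carry a pair, n - 2i of the remaining d - i days carry one person.
        total += Fraction(comb(d, i) * comb(d - i, n - 2 * i), 2**i)
    return factorial(n) * total / d**n


def p3(n: int, d: int) -> Fraction:
    """Probability that some birthday is shared by three or more people."""
    return 1 - q1(n, d) - q2(n, d)


if __name__ == "__main__":
    print(float(p3(87, 365)), float(p3(88, 365)))
```

Evaluating `p3` at 87 and 88 people brackets the 50% threshold, consistent with the value 88 quoted above for three coincident birthdays.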
A good approximation to the number of people $n$ such that $p=P_k(n,d)$ is some given value can be given by solving the equation
$$n\,e^{-n/(dk)} = \left[d^{\,k-1}\,k!\,\ln\!\left(\frac{1}{1-p}\right)\left(1-\frac{n}{d(k+1)}\right)\right]^{1/k} \tag{17}$$
for $n$ and taking $\lceil n\rceil$, where $\lceil x\rceil$ is the ceiling function (Diaconis and Mosteller 1989). For $d=365$, $p=1/2$, and $k=1$, 2, 3, ..., this formula gives $n=1$, 23, 88, 187, 313, 459, 622, 797, 983, 1179, 1382, 1592, 1809, ... (OEIS A050255), which differ from the true values by between 0 and 4. A much simpler but also poorer approximation for $n$ such that $p=1/2$ for $k<20$ is given by
$$n = 47\,(k-1.5)^{3/2} \tag{18}$$
(Diaconis and Mosteller 1989), which gives 86, 185, 307, 448, 606, 778, 965, 1164, 1376, 1599, 1832, ... for $k=3$, 4, ... (OEIS A050256).
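Both approximations can be evaluated numerically. The sketch below assumes the implicit equation has the form $n e^{-n/(dk)} = [d^{k-1} k! \ln(1/(1-p)) (1 - n/(d(k+1)))]^{1/k}$ and solves it by bisection on the interval $[0, dk]$, where the left side increases and the right side decreases. It also assumes the simpler estimate is $47(k-1.5)^{3/2}$, rounded down to match the quoted values. Function names and the bracketing interval are choices made for this illustration.

```python
import math


def dm_estimate(k: int, d: int = 365, p: float = 0.5) -> int:
    """Ceiling of the root n of the implicit approximation, found by bisection."""

    def gap(n: float) -> float:
        lhs = n * math.exp(-n / (d * k))
        rhs = (
            d ** (k - 1) * math.factorial(k)
            * math.log(1.0 / (1.0 - p))
            * (1.0 - n / (d * (k + 1)))
        ) ** (1.0 / k)
        return lhs - rhs

    lo, hi = 0.0, float(d * k)  # gap(lo) < 0 < gap(hi) for the cases tried here
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return math.ceil(hi)


def simple_estimate(k: int) -> int:
    """Cruder estimate 47 (k - 1.5)^(3/2), rounded down."""
    return math.floor(47 * (k - 1.5) ** 1.5)


if __name__ == "__main__":
    print([dm_estimate(k) for k in range(1, 5)])
    print([simple_estimate(k) for k in range(3, 8)])
```

Bisection is used rather than a library root finder only to keep the example dependency-free; any bracketing solver would serve.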
The "almost" birthday problem, which asks for the number of people needed such that two have a birthday within a day of each other, was considered by Abramson and Moser (1970), who showed that 14 people suffice. An approximation for the minimum number of people needed to get a 50-50 chance that two have a match within $k$ days out of $d$ possible is given by
$$n(k,d) = 1.2\sqrt{\frac{d}{2k+1}} \tag{19}$$
(Sevast'yanov 1972, Diaconis and Mosteller 1989).
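As a small check of this approximation, the sketch below evaluates it under the assumption that it reads $1.2\sqrt{d/(2k+1)}$ and rounds up to a whole number of people. The function name `near_match_estimate` is illustrative.

```python
import math


def near_match_estimate(k: int, d: int = 365) -> float:
    """Approximate group size for a 50-50 chance of two birthdays within k days."""
    return 1.2 * math.sqrt(d / (2 * k + 1))


if __name__ == "__main__":
    for k in range(0, 4):
        print(k, math.ceil(near_match_estimate(k)))
```

With $d = 365$, rounding up gives 23 for $k = 0$ (an exact match) and 14 for $k = 1$, in line with the figures quoted in the text.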