Abstract
The birthday paradox is one of the best-known problems in probability¹. It asks: “How many randomly chosen people are needed to achieve at least a 50% probability that some pair will have been born on the same day?” Since the chance of any two persons having the same birthday is remote, many of us would expect this number to be rather large. However, it turns out that this is not the case, hence the paradox.
A simulation was written to run multiple trials in which the birthdays of the people in a group of a given size are compared. The probability is estimated as the number of successful experiments as a proportion of the total number of experiments performed.
As the theoretical value is already known (23), it was also the target result for the algorithm implemented in Java, and happily, it was achieved.
Comment [JM1]: Keep it formal and to the point.
1. INTRODUCTION
Premise born in 1938²: In a room of just 23 people, there is a roughly 50% probability of two of these people sharing the same birthday (ignoring years of birth). In a room of 75 people, there is a 99.9% chance of two people with matching birthdays. This is one particular case of exponential (Saliusian) sets, where duplicates are allowed. Exponents are not intuitive, which is why our linear thinking leads us to an incorrect estimate!
Comment [JM2]: Use a reference rather than a footnote.
Comment [JM3]: A reference would be welcome here.
2. BACKGROUND
Comment [JM4]: This could be compressed by simply giving a single equation and pointing the reader to a reference where more detail is given.
Mathematical Model
One of the basic rules of probability is that the sum of the probability that an event will happen and the probability that the event will not happen is always 1. In other words, the chance that anything might or might not happen is always 100%.
If we can work out the probability that no two people will have the same birthday, we can use this rule to find the probability that at least two people will share a birthday:
P(event happens) + P(event does not happen) = 1
→ P(two people share birthdays) + P(no two people share birthdays) = 1
P(two people share birthdays) = 1 - P(no two people share birthdays)
Comment [JM5]: Mixed font size.
The formula for the probability that n people have different birthdays (month and day) is³:
$$\frac{365!}{(365 - n)! \cdot 365^{n}} \qquad (1)$$
Therefore, the probability that at least two of them share the same birthday is:
$$1 - \frac{365!}{(365 - n)! \cdot 365^{n}} \qquad (2)$$
The model makes the following assumptions:
- There are only 365 days in a year, i.e. leap years are ignored. This also means ignoring the rule that the leap day is omitted in years divisible by 100 unless they are also divisible by 400.
- Birth years are ignored.
- Birthdays are equally distributed throughout the year (i.e. influencing factors such as seasonality are not taken into account). Obviously, in real life, birthday distributions are not uniform, i.e. not all dates are equally likely.
- The date of one person's birthday does not affect the date of another person's birthday, i.e. twins, triplets, etc. are not considered.
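Equation (2) can be evaluated without computing large factorials, because 365! / ((365 - n)! * 365^n) is the product of the terms (365 - k) / 365 for k = 0 … n - 1. The following is a minimal sketch under the assumptions listed above (365 equally likely, independent birthdays). The class name BirthdayExact and its method names are hypothetical and are not taken from the paper's source code.

```java
public class BirthdayExact {
    // Assumption from the model: 365 equally likely days, no leap years.
    static final int DAYS = 365;

    // Equation (1): probability that n people all have different birthdays,
    // computed as the running product of (365 - k) / 365 for k = 0..n-1.
    public static double probAllDistinct(int n) {
        if (n > DAYS) {
            return 0.0; // more people than days: a shared birthday is certain
        }
        double p = 1.0;
        for (int k = 0; k < n; k++) {
            p *= (double) (DAYS - k) / DAYS;
        }
        return p;
    }

    // Equation (2): probability that at least two of n people share a birthday.
    public static double probShared(int n) {
        return 1.0 - probAllDistinct(n);
    }

    // Smallest group size whose shared-birthday probability reaches the target.
    public static int smallestGroup(double target) {
        int n = 1;
        while (probShared(n) < target) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        int n = smallestGroup(0.5);
        System.out.println("Smallest group size for P >= 0.5: " + n);
        System.out.println("P(shared birthday) at that size: " + probShared(n));
    }
}
```

With a target of 0.5 the search stops at a group size of 23, which agrees with the theoretical value quoted in the abstract.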
Figure 1: A graph showing the approximate probability of at least two people sharing a birthday amongst a certain number of people⁴.
¹ This is not a paradox in the literal sense; it just highlights the fact that people expect the value to be much larger.
² American Mathematical Monthly in 1938, in Zoe Emily Schnabel's "The estimation of the total fish population of a lake", under the name of capture-recapture statistics.
³ As per Dr Math FAQ – The Birthday Problem (http://mathforum.org/dr.math/faq/faq.birthdayprob.html)
⁴ Wikipedia - Birthday Problem (http://en.wikipedia.org/wiki/Birthday_paradox)
3. METHOD
Programming Methodology
The simulation was written in the Java language⁵. A text file that lists multiple trials with the following parameters serves as input to the […]
Starting group size (default: N = 2)
[…]
Assumptions:
[…] start the simulation with N = K. The reason is simple: How do […]
For each trial, a number of random birthdays are generated and placed into an array (here, we use the Julian Date format, i.e. the day of the year, 1…365). These birthdays are sorted and then iterated through to find equal values in consecutive elements of the array, denoting a success. Once the trials have finished running, the probability is evaluated as⁶:
currProbability = numSameBirthday / numTrials
Comment [JM6]: Why not use algebraic notation as in (1) and (2)? Also: this equation is not numbered.
4. RESULTS & DISCUSSION
Because the simulation reads an input file, multiple scenarios can be generated. Table 1 shows sample data from the input file (in the format described in Section 3):
2 10000 2 1 0.5
2 20000 2 1 0.5
2 30000 2 1 0.5
2 50000 2 1 0.5
2 700000 2 1 0.5
2 1000000 2 1 0.5
2 2000000 2 1 0.5
2 5000000 2 1 0.5
Table 1: Input File sample data.
Comment [JM7]: Explain the entries in the table.
Comment [JM8]: Why capitalise?
The last record in this table is graphed in Figure 2. Preparing graphs for all records processed in this table (Appendix 4) reveals, interestingly, that time performance dips in proportion to the curve of probability when tending to 23 people, irrespective of the number of trials:
[Figure 2: Simulation 8, plotting Probability (left axis) and Time (ms) (right axis) against No. People.]
Another approach to solving this problem would be to base it on collisions: tracking each person as they enter the room and checking whether there is a match with any other person. Only an array of 365 elements would be needed, and a random date would be generated for each person only until either the entire group is exhausted or a match is found. The author of this paper decided against this approach after initially selecting it, as it would not be possible to measure time performance as fluidly.
Comment [JM10]: We
6. REFERENCES
[1] Birthday Paradox. Wikipedia.com (http://en.wikipedia.org/wiki/Birthday_paradox)
[2] CS1101C Lab 3 – Birthday Paradox. National University of Singapore, School of Computing (http://www.comp.nus.edu.sg/~cs1101cl/labs_sem2_0405/lab3/oddweek/)
[3] How to Generate Random numbers. About.com (http://java.about.com/od/javautil/a/randomnumbers.htm)
[4] Quick Sort Implementation with median-of-three partitioning and cutoff for small arrays. Java-Tips.org (http://www.java-tips.org/java-se-tips/java.lang/quick-sort-implementation-with-median-of-three-partitioning-and-cutoff-for-small-a.html)
⁵ Adapted from [2] http://www.comp.nus.edu.sg/~cs1101cl/labs_sem2_0405/lab3/oddweek/paradox.c
⁶ See Appendix 1 – Pseudocode.
APPENDIX 1 – PSEUDOCODE
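DO
{
    SET variable numSameBirthday = 0
    WHILE (trials remain for the current group size)
    {
        Generate a random birthday (1…365) for each person and sort the array
        FOR (each pair of consecutive elements in the sorted array)
        {
            Check for match between consecutive elements: IF TRUE THEN numSameBirthday = numSameBirthday + 1 and stop scanning this trial
        } END FOR
    } END WHILE
    SET currProbability = numSameBirthday / numTrials
} END DO LOOP
Comment [JM11]: Complexity of th[…] could have been discussed in the paper.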
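The sort-and-scan procedure in the pseudocode above, and the collision-based alternative mentioned in the paper, can be sketched in Java as follows. This is a minimal illustration and not the author's BirthdaySimulation.java: the class name BirthdaySketch, the method names, the trial count and the fixed seed are all assumptions, and java.util.Arrays.sort stands in for the quicksort of reference [4].

```java
import java.util.Arrays;
import java.util.Random;

public class BirthdaySketch {
    static final int DAYS = 365; // model assumption: no leap years

    // Sort-and-scan trial: draw n birthdays, sort them, then look for
    // equal values in consecutive elements. Returns true on a match.
    public static boolean trialSortAndScan(int n, Random rng) {
        int[] days = new int[n];
        for (int i = 0; i < n; i++) {
            days[i] = rng.nextInt(DAYS) + 1; // day of year, 1..365
        }
        Arrays.sort(days);
        for (int i = 1; i < n; i++) {
            if (days[i] == days[i - 1]) {
                return true; // count each trial at most once
            }
        }
        return false;
    }

    // Collision trial: mark each day as people "enter the room" and stop
    // at the first day that has already been seen.
    public static boolean trialCollision(int n, Random rng) {
        boolean[] seen = new boolean[DAYS + 1];
        for (int i = 0; i < n; i++) {
            int day = rng.nextInt(DAYS) + 1;
            if (seen[day]) {
                return true;
            }
            seen[day] = true;
        }
        return false;
    }

    // Estimated probability = successful trials / total trials.
    public static double estimate(int n, int numTrials, boolean useCollision, Random rng) {
        int numSameBirthday = 0;
        for (int t = 0; t < numTrials; t++) {
            boolean match = useCollision ? trialCollision(n, rng) : trialSortAndScan(n, rng);
            if (match) {
                numSameBirthday++;
            }
        }
        // Cast first: int / int would truncate the result to 0.
        return (double) numSameBirthday / numTrials;
    }

    public static void main(String[] args) {
        Random rng = new Random(42); // fixed seed (assumption) for repeatable runs
        for (int n = 2; n <= 23; n++) {
            long start = System.nanoTime();
            double p = estimate(n, 100_000, false, rng);
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.println(n + " people: p = " + p + ", time = " + ms + " ms");
        }
    }
}
```

The cast in estimate matters if the counters are integers, since integer division in Java would otherwise truncate every estimate below 1 to 0. Both trial methods stop at the first match, so a trial with several shared birthdays is still counted as one success.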
APPENDIX 2 – SOURCE CODE
BirthdaySimulation.java
[Figures: Simulation 1 to Simulation 8. Each panel plots Probability (left axis, 0 to 0.6) and Time (ms) (right axis) against No. People (1 to 22).]