I received one of those Facebook emails the other day telling me that three of my friends had birthdays on the same day. Now this might not have been too much of a surprise to me if I had about five hundred friends, but I don’t.
Should I have been surprised, given that I only have about fifty friends on Facebook?
This is the sort of question that is all about chance: something that statisticians learn about at school or college and subsequently use in their work. But sometimes even seemingly simple questions like this can be difficult for a statistician to analyse. One difficulty is that, to move forward, we have to assume that all birthdays are equally likely, and clearly they are not [see box 1].
You will often see assumptions stated by statisticians; they help to describe the accuracy of the conclusions drawn from a statistical analysis (http://en.wikipedia.org/wiki/Statistical_assumption).
Box 1: Assumption that all birthdays are equally likely.
It turns out that there isn’t that much difference in the UK month on month. There is a peak of births in July in the UK (about 70,000 in July 2011) (http://data.un.org/Data.aspx?d=POP&f=tableCode:55), and a trough in February (about 61,000 in February 2011).
As an aside, the peak was in August in the US in 2010 (http://www.statisticbrain.com/birth-month-statistics/).
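One way to sanity-check the Box 1 figures is to convert the monthly totals into births per day, since July has 31 days and February 2011 had only 28. The sketch below uses only the two approximate totals quoted above; the variable names are my own, and the monthly figures are rounded, so treat the result as indicative only.

```python
# Approximate UK monthly births quoted in Box 1 (2011), paired with days in the month.
births = {"July": (70_000, 31), "February": (61_000, 28)}

# Births per day for each month, so that month length does not distort the comparison.
per_day = {month: total / days for month, (total, days) in births.items()}

# How much busier the peak month is than the trough month, per day.
ratio = per_day["July"] / per_day["February"]

for month, rate in per_day.items():
    print(f"{month}: about {rate:.0f} births per day")
print(f"Peak-to-trough ratio per day: {ratio:.3f}")
```

On a per-day basis the gap between peak and trough is only a few percent, which is why the equal-likelihood assumption is a tolerable simplification here.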
Before trying to answer this question, let’s change the situation completely and consider other coincidences and chance findings we might experience socially. As a family, we were on holiday the other week and we met someone I knew. “Well, fancy seeing you here”, I said.
Should I have been surprised [see box 2]?
Box 2: Assumption that all coincidences are equally likely.
In this situation I hadn’t said beforehand to my family that I would meet a particular person, at a specific event, on a certain day, at a point in time. It is an imprecise research question. As statisticians we can’t put numbers on coincidences like this, and maybe I shouldn’t have been at all surprised, given how wide a net this question casts.
And now to the answer to my first question, i.e. Facebook birthdays. This is a more precise research question and it can be answered by statisticians. I am specifically talking about friends (not work colleagues, doctors, dentists or whoever else I know), birthdays (not random events), and a specific day (not any unforeseen day).
Usually in statistics we look at a table of critical values (e.g. z, t, χ², etc.) to draw a conclusion about our findings. In this case we use an online calculator. This tells me that I should be surprised if three people shared the same birthday among fewer than 88 friends (this is my critical value).
And in true statistical fashion, we use something called a test statistic to see how my observation in practice compares with the critical value. My test statistic is three people sharing the same birthday among 50 friends on Facebook. Since 50 is less than 88, I was right to be surprised that three people share the same birthday among so few friends.
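For readers who would like to check the numbers without an online calculator, here is a minimal sketch of an exact calculation. It assumes 365 equally likely birthdays and ignores leap years (the same assumption discussed in Box 1). The function names prob_triple and smallest_group are my own, and this is not necessarily how the calculator I used works. The idea is to count the arrangements in which no day has more than two birthdays, and subtract that share from one.

```python
from fractions import Fraction
from math import comb, factorial


def prob_triple(n, days=365):
    """Probability that at least three of n people share a birthday,
    assuming every day is equally likely (an assumption, see Box 1)."""
    no_triple = 0
    # k = number of days shared by exactly two people;
    # the remaining n - 2k people each have a day to themselves.
    for k in range(n // 2 + 1):
        singles = n - 2 * k
        ways_to_pick_days = comb(days, k) * comb(days - k, singles)
        # Ways to place n distinct people on those days (each pair is unordered).
        ways_to_place_people = factorial(n) // 2**k
        no_triple += ways_to_pick_days * ways_to_place_people
    return 1 - Fraction(no_triple, days**n)


def smallest_group(threshold=Fraction(1, 2), days=365):
    """Smallest group size whose triple-birthday probability reaches the threshold."""
    n = 1
    while prob_triple(n, days) < threshold:
        n += 1
    return n


print(f"50 friends: {float(prob_triple(50)):.3f}")
print(f"Smallest group with at least a 50% chance: {smallest_group()}")
```

Under these assumptions the 50% threshold is first reached at 88 people, which matches the critical value above, and the probability for 50 friends is well below one half.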
But then again, I did say earlier that we have to be aware of the assumptions!
And no comments please about my lack of friends.
References
Byron Jones and Robb Muirhead (2012) What a coincidence! It’s not as unlikely as you think. Significance 9 (1) pp. 40-42
Mario Cortina Borja (2013) The strong birthday problem. Significance 10 (6) pp. 18-20
Visualisation:
In an earlier blog I discussed a presentation I gave to a group of school children about being a statistician (to one school in Edinburgh). You can find it here:
I found this interactive graphic, from the Office for National Statistics (ONS), which provides a picture of baby names in England & Wales. It’s not comparable with my presentation, but it is an interesting way of showing the results (http://www.ons.gov.uk/ons/interactive/top-100-baby-names-in-england-and-wales---dvc11/index.html).
If you like what I talk about then:
Follow me on Twitter: https://twitter.com/IDMorton001
Connect with me on LinkedIn: www.linkedin.com/in/idmorton
See my Presentations on SlideShare: http://www.slideshare.net/IDMorton001
Here is an example of a presentation I recently gave entitled “Process Improvement & Design of Experiments – Lessons Learnt from a European Statistics Conference”: