
Thursday, 24 November 2016

Advent: A Trip To London

A few weeks ago, I received an email (which was sent to all UK employees) from the UK Human Resources department.  I work for a large multinational company, and it's not uncommon to get a standard email reminding employees to check their pension contributions, their benefits package for the year or whatever.  None of these usually apply to me, but this latest one did.  To paraphrase:

The UK government's immigration compliance regulations require all employers to have proof of "Right to Work" documentation for all employees.  This means that we'd like to collect up-to-date copies of this documentation for all colleagues in the UK who joined the company before April 2016.  This usually means your passport, but it could be an alternative form of ID.  We need to see the original passport (or similar), so you must present it in person to a team manager or member of the Human Resources team.  We are setting up dedicated times at each of our UK offices, so that you can attend whichever is most convenient for you.

I work from home, in Stoke-on-Trent, which is over 100 miles from the nearest office, so I needed to plan a day away from my usual 'office' and a trip to whichever office was most convenient for me.  Although it's not the closest, the London office is the easiest to get to (just 90 minutes on the train from Stoke to London Euston), so I booked the morning - and most of the afternoon - away from my desk, and made the 300+ mile round trip to London less than a week later.

Oxford Street, London, late October 2016 (my picture)

Being asked to make a compulsory journey for the benefit of central government reminded me of another journey, which many people will find very familiar:

And it came to pass in those days, that there went out a decree from Caesar Augustus, that all the world should be taxed. (And this taxing was first made when Cyrenius was governor of Syria.) And all went to be taxed, every one into his own city. And Joseph also went up from Galilee, out of the city of Nazareth, into Judaea, unto the city of David, which is called Bethlehem; (because he was of the house and lineage of David) to be taxed with Mary his espoused wife, being great with child.  (Luke 2:1-5, King James Bible)


Now, I can't say for certain, but I'm fairly sure that the government's immigration compliance regulations are probably related to making sure they collect their tax.  So, if you think covering 100 miles or more to get registered to pay tax is a strange or unrealistic event - and only happened in the first century AD - you might want to think again.

Monday, 7 November 2016

Is That A Lot?

No matter how well we research and present our numerical data, there is one question that we will almost always face:  "Is that a lot?"  Does it show a lack of understanding on the part of our audience, or did we just not make it perfectly clear that our recommendation is earth-shattering, game-changing and generally just awesome?  There are various reasons why our data isn't being received with the awe that it deserves; here are some ways of addressing the gaps.

External Comparison
If you want to give an effective image of how many people visited your site (either normally, or in response to a marketing campaign) then it may be useful to compare them to an external figure.  For example, if you saw 90,000 people respond to your marketing campaign, you might get asked if that's a lot.  One answer:  it's equal to the capacity of Wembley Stadium in London. 
 
As another example, 8 million people fly in an airliner each day.  Is that a lot?  On the one hand, it's about 0.11% of the total population of the world.  On the other hand, it's almost equal to the population of London (8.67 million).  Is that a lot?




Naturally, it helps to have a list of populations for various cities, towns and countries if you want to keep using external comparisons.
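As a rough sketch of how such a lookup list could be used, the snippet below expresses a figure as a percentage of a few reference values.  The Wembley and London figures are the ones quoted above; the world population is my own assumed 2016 estimate, and the function name is purely illustrative.

```python
# Reference figures. Wembley's capacity and London's population are the
# ones quoted in the text; the world population is an assumed estimate.
BENCHMARKS = {
    "Wembley Stadium capacity": 90_000,
    "population of London": 8_670_000,
    "world population (assumed, approx. 2016)": 7_400_000_000,
}


def compare(value, benchmarks=BENCHMARKS):
    """Return {benchmark name: value as a percentage of that benchmark}."""
    return {name: 100 * value / size for name, size in benchmarks.items()}


daily_flyers = 8_000_000
for name, pct in compare(daily_flyers).items():
    print(f"{daily_flyers:,} is {pct:.2f}% of {name}")
```

The same number reads as tiny or huge depending on which benchmark you pick, which is exactly why the choice of comparison matters.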

Internal Comparison
Probably more effective than external comparison, this uses your current data on your website, sales, revenue, whichever, and calls out how current performance compares to other parts of your site.  For example, you might compare sales or traffic for shoes with shirts, trousers and socks; or perhaps you'd compare SUVs with sports, hatchbacks and estate cars.


This is most effective if you find that traffic to the different internal sections of your site changes (e.g. seasonally) but isn't going to work well if there's little change in the relative traffic to each part (e.g. if shirts always has more traffic than shoes, and shoes are more popular than trousers, etc.).  You could also express this as a share of total traffic: "Menswear traffic rose from 30% to 40% of total site traffic this week" (which also eliminates the overall variation in site traffic - whether you want to make use of that effect or not).
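Here is a minimal sketch of the share-of-total calculation.  The section names and visit counts are hypothetical, chosen only so that menswear moves from 30% to 40% as in the example sentence above.

```python
def traffic_share(visits_by_section):
    """Return each section's visits as a percentage of total site visits."""
    total = sum(visits_by_section.values())
    return {
        section: 100 * visits / total
        for section, visits in visits_by_section.items()
    }


# Hypothetical weekly visit counts per section.
last_week = {"menswear": 30_000, "womenswear": 50_000, "kids": 20_000}
this_week = {"menswear": 48_000, "womenswear": 50_000, "kids": 22_000}

before = traffic_share(last_week)["menswear"]
after = traffic_share(this_week)["menswear"]
print(f"Menswear share of traffic: {before:.0f}% -> {after:.0f}%")
```

Note that total traffic also rose in this made-up example (100,000 to 120,000 visits); the share figure hides that, for better or worse.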

Trending
- This was the highest for six months
- ...the lowest for eight months
- ... the second highest this year
- ...making it the third lowest in the last five years

If you're going to pursue this strategy, then it also helps to have a reason why things were high six months ago, or low eight months ago e.g. "This month was the lowest for 15 months, when one of our competitors had a massive sale and undercut us for three consecutive weeks." or "This month was the highest for six months, when we had the pre-Christmas sale."   This helps connect the data to real-life events and brings the data to life.  "Do you remember that time when our site was really busy?  Well, it's even busier than that."

- The UK Meteorological Office do this with their "since records began" expression, and according to NASA, July 2016 was the world's hottest month since records began.
- The UK census showed a population boom that was also the largest since records began.
- TV data shows that the Rio Olympics in 2016 got the smallest TV audience in Brazil since the 2004 games.  The reason is that more people streamed the games online:  it's always good to have a reason why a metric jumps or falls sharply (read more in my article about moving from reporting to insight).


As you can see, these kinds of 'highest since/lowest since' statements really make great headlines, so don't be afraid of using them if you want to instil a sense of urgency into your reporting or analysis.
If it's been a fairly average month, and hasn't been the biggest/best/worst/lowest month since Christmas/Thanksgiving/Easter/ever, then you could always do a comparison with the previous period.  Year-on-year or month-on-month comparisons are widely used - especially year-on-year (YoY), which conveniently removes any seasonal effects (if it was Back to School this year, it will have been Back to School last year too).

Trends, of course, are very easily represented as graphs - line charts or bar charts, depending on your personal preference.  Here's an example I've used in the past, showing the current year trend, and last year's trend.  I thought it was fairly intuitive, and with a bit of stakeholder education (I showed them what it was and what it meant), it became the standard way of showing YoY trends, and the current trend.  The bars are last year; the line is this year.  The colour of the line matched the colour of the particular part of the site being discussed (e.g. blue could be men's wear, pink could be ladies' wear - beware of using red and green, as these are shortcuts for 'bad' and 'good' respectively).



Financial Metrics

If you really want to make your stakeholders take action, connect your recommendations and analysis to the money.  Nobody's sure if 19,354 visitors is a lot, but everybody knows how much £19,354 is, or how much $19,354 will buy you.  Whether you go for a trended view, or an external or internal comparison, you can still say, "We made $54,218 this week.  Is that a lot?  It's 15% more than the week before, but 4% less than the same week last year."  Suddenly everybody's paying attention; and if you're lucky, they'll ask you what you recommend doing about it.  Have your answers ready!
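The week-on-week and year-on-year comparisons in that example are simple percentage changes.  In the sketch below, the $54,218 is the figure from the text, while the two comparison values are hypothetical numbers I've back-calculated so that the results come out at roughly +15% and -4%.

```python
def pct_change(current, previous):
    """Percentage change from previous to current."""
    return 100 * (current - previous) / previous


this_week = 54_218            # figure quoted in the text
previous_week = 47_146        # hypothetical comparison value
same_week_last_year = 56_477  # hypothetical comparison value

wow = pct_change(this_week, previous_week)
yoy = pct_change(this_week, same_week_last_year)
print(f"Week on week: {wow:+.1f}%")
print(f"Year on year: {yoy:+.1f}%")
```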

I've written before about actionable analysis - connecting any metric to a KPI or to a financial figure immediately makes analysis more actionable.


Conclusion
So, when you tell your manager that the figure is $150, and your manager decides it's time to emulate Admiral Kirk by asking "Is that a lot?" you can be ready with a comparative or trended view of the data to say, "Well, it may not buy you a gold watch, but it'll get you two bus tickets to the whales on the other side of San Francisco".

Admiral Kirk asks, "Is that a lot?"

Thursday, 20 October 2016

English Premier League: Which Season Ticket is the Best Value?

In my two previous posts, I've examined the data for the English Premier League for the last ten seasons, reviewing how 'exciting' each season has been.  I've drawn some conclusions, segmented the data and found some interesting data points, but not yet produced anything that's really useful, or that can help a football fan.

It's time to move on, and to provide some facts and figures that are more meaningful and more useful than what I've written previously - in particular, to look at the relative value and cost of season tickets for each of the teams.  But first, a quick recap:

Post number 1: Less than 10% of English Premier League games are goal-less (0-0) draws.
Post number 2:  Arsenal consistently achieve more goals per game (scored plus conceded) than average, while Everton frequently have fewer goals per game than average.

All very interesting and fascinating and useful to quote, but not really anything you can do anything with.  So far, the best recommendation I could make is: "If you were given the choice between watching an Arsenal game or an Everton game, I'd recommend the Arsenal game."

What I propose to do next is to start connecting the data I have to some additional data that will help form recommendations - in this case (and in most cases in business), money.  Money, in the form of reduced costs or increased sales and revenue, is often the essential part of any business recommendation, and I can apply the same process here.  We know how many goals per game (on average) we will see for each team in the English Premier League, but what we haven't yet identified is how much it would cost to see each game, and how much it will cost per goal.

In order to calculate this, I've taken the data from 2015-16 (the most recent completed season) and looked at the costs of season tickets, using the Sky Sports website for the costs.  I'm using the cheapest standard adult season ticket cost in each case.

Jumping straight into the analysis - let's compare the cost of a season ticket to the average number of goals per game for the 2015-16 season:



And then compare the season tickets on a "cost per goal" basis, again for the 2015-16 season:



Isn't it interesting how the data has become more relevant, meaningful and even actionable when you start introducing money?

Arsenal may usually have the largest number of goals per season (or per game), and consistently achieve above-average performance there, but if you want to watch 'exciting' football of their type, you're going to have to pay for it.  (Note that the 2015-16 season was lower than usual for Arsenal, who actually came in below average for goals per game.)

If you want the best value for your season ticket, then Man City is the place to go, at just £2.67 per goal - and you'll see plenty of goals too.

This data could be displayed geographically (are London clubs better value than other regions?) or sorted in various other ways.  Beware, though, while you do this, of introducing apparent trends in your data where there are none:




This one isn't too bad, although it does look like season tickets are coming down in price.




This second one, though, makes it appear that (1) there is a trend, and (2) season ticket prices are going up (which is generally the case).

In Summary

In this series, I've moved from data to analysis to insight:

Post number 1: Less than 10% of English Premier League games are goal-less (0-0) draws. Data, and analysis

Post number 2:  Arsenal consistently achieve more goals per game (scored plus conceded) than average, while Everton frequently have fewer goals per game than average.  Analysis, but still nothing actionable.


Post number 3 (this post):  Arsenal may have the most goals on average, but in 2015-16 the cost of seeing a goal (£10.25) was much higher than the other clubs: 20% higher than the next-highest (Southampton, £8.53) and nearly four times higher than seeing a goal at Man City (£2.67, actually 3.83 times more).
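For anyone who wants to check the comparisons, here's a small sketch.  The per-goal prices are the ones quoted in this post; the `cost_per_goal` function shows the basic division I'm assuming (ticket price divided by goals seen), and the price and goal count passed to it are purely hypothetical.

```python
def cost_per_goal(ticket_price, goals_seen):
    """Season ticket price divided by the number of goals seen."""
    return ticket_price / goals_seen


# Hypothetical club: a 500-pound season ticket and 100 goals in the season.
print(f"Hypothetical club: {cost_per_goal(500, 100):.2f} per goal")

# Cost-per-goal figures quoted in the post (GBP, 2015-16 season).
quoted = {"Arsenal": 10.25, "Southampton": 8.53, "Chelsea": 4.77, "Man City": 2.67}

premium_over_next = 100 * (quoted["Arsenal"] / quoted["Southampton"] - 1)
multiple_of_cheapest = quoted["Arsenal"] / quoted["Man City"]

print(f"Arsenal vs Southampton: {premium_over_next:.0f}% higher")
print(f"Arsenal vs Man City: {multiple_of_cheapest:.2f} times the cost per goal")
```

Working from the rounded prices gives a multiple of about 3.84; the 3.83 quoted above presumably comes from the unrounded figures.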

Recommendations:
If you have the choice of watching an Everton match or an Arsenal match as a neutral, pick the Arsenal match.

Buy a season ticket for Man City, Villa, or West Brom.  If you want to follow a London club, the best value season ticket for London was Chelsea at £4.77 per goal, less than half the cost per goal of an Arsenal ticket.  Actionable analysis.


Review
In a future post, I'll look at this worked example, pulling apart the differences between data, analysis, actionable analysis and insight.


Wednesday, 12 October 2016

How exciting is the English Premier League?

So, it's the start of the English Premier League (EPL) season. Sport generates vast amounts of data, all available for analysis and insight, and in this post (and probably a couple of following posts), I will be looking at the English Premier League (football, aka soccer) for recent years and reviewing how the game has changed. This will form a practical look at data, reporting, analysis, insight and actionable analysis.

This is a reconstructed post: I originally posted this in September but the post has since been deleted or lost.  Here's what I can remember of it.


There are a number of questions to be asked (and answered):


How 'exciting' is the English Premier League?

How many goals can you expect to see per game?
How many games end in goal-less draws?
How many games are won by a one-goal margin (perhaps a good definition of a tense, exciting game)?

This data can then be used to compare the English Premier League with other leagues (in the UK and abroad).
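The three questions above can be answered from a simple list of final scores.  This is only a sketch of the calculation: the function name is my own, and the sample scorelines are hypothetical rather than real season data.

```python
def season_metrics(results):
    """results: list of (home_goals, away_goals) tuples, one per game."""
    games = len(results)
    total_goals = sum(home + away for home, away in results)
    goalless = sum(1 for home, away in results if home == 0 and away == 0)
    one_goal_margin = sum(1 for home, away in results if abs(home - away) == 1)
    return {
        "goals_per_game": total_goals / games,
        "pct_goalless": 100 * goalless / games,
        "pct_one_goal_margin": 100 * one_goal_margin / games,
    }


# Hypothetical sample of ten final scores.
sample = [(6, 2), (8, 2), (0, 0), (1, 0), (2, 1),
          (1, 1), (3, 2), (0, 0), (2, 0), (1, 2)]

for metric, value in season_metrics(sample).items():
    print(f"{metric}: {value:.1f}")
```

Running the same function over each season's full list of results gives the per-season figures needed for the trended comparisons.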

So, to start with, what's the average number of goals per game (total scored by both teams) for each of the last eleven seasons?

And the answer is:

And how does this compare with the percentage of games that are dull, uninteresting, goal-less draws?


The line graph above shows the percentage of goal-less draws.  It doesn't exactly trend with the average number of goals per game, but when the percentage of goalless draws is high (2008-2009) then the average goals per game is low (less than 2.5).

This does lead to an interesting point that would make marketers and headline-writers happy: "Less than 10% of EPL games end in goalless draws" (excluding 2008-2009).

Now we can see that 2006-2007 had the lowest average number of goals per game, while 2011-2 had the highest; we can then analyse these two seasons side by side - see below - to understand where the differences were.

Key points:
- 2006-7 had 34 0-0 draws, compared to 27 for 2011-12.  Only 2008-9 had fewer (25).
- 2011-2 had more games with five, six, seven, eight and ten goals.  
- The highest scoring game in 2006-7 was Arsenal 6 - Blackburn 2.  
- In 2011-12, the highest scoring game was Man United 8 - Arsenal 2.

Finally, which seasons were most interesting from the perspective of one-goal winners?  Not just 1-0, but 2-1, 3-2, 4-3 and so on.   
2011-12, with its huge average number of goals per game, doesn't do so well here.  2006-7 and 2008-9, the two seasons with low goals per game and a high percentage of goalless draws, do marginally better - they were both really mean seasons.

Football data obtained from this football website; others are available.

--

Summary

Analysing the data at this level - with trended comparisons - has given us the ability to compare one time period with another.  There's nothing actionable here, but we get a nice headline about the percentage of 0-0 draws.  In the next post I wrote (chronologically, before the original version of this post was lost), I segmented the data by team, and that provided more interesting insights.

Other articles I've written looking at data and football

Checkout Conversion:  A Penalty Shootout
When should you switch off an A/B test?
The Importance of Being Earnest with your KPIs
Should Chelsea sack Jose Mourinho? (It was a relevant question at the time, and I looked at what the data said)
How Exciting is the English Premier League?  what does the data say about goals per game?

Friday, 23 September 2016

Premier League Excitement - Further Analysis

In my last post I looked at 'How exciting is the Premier League' and produced the interesting data point that less than 10% of Premier League games are goal-less.  This may be interesting, and it might even count as insight, but it's not very actionable.  We can't do anything with it, or make any decisions from it.  I suppose the question is, "Is that a lot?" and I'll be looking at that question in more detail in future.

So, my next step is to look at how the different teams in the Premier League compare on some of the key metrics that I discussed - goals per game (total conceded plus scored), percentage of goalless games and so on.

Number of goals per game (conceded plus scored)

Firstly, I segmented the data per team:  how many goals were there per game for each team in the Premier League.  This is time-consuming, but worthwhile, and a sample of the data is shown below.  I have data as far back as the 2004-5 season, but the width wouldn't fit on this page: 
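
| Club | Y2010 | Y2011 | Y2012 | Y2013 | Y2014 | Y2015 | Y2016 |
|---|---|---|---|---|---|---|---|
| Arsenal | 2.58 | **3.03** | **3.24** | 2.87 | 2.87 | 2.82 | 2.66 |
| Aston Villa | 2.21 | 2.82 | 2.37 | **3.05** | 2.63 | 2.32 | 2.71 |
| Birmingham | | 2.50 | | | | | |
| Blackburn | 2.79 | 2.76 | **3.32** | | | | |
| Bolton | 2.61 | 2.84 | **3.24** | | | | |
| Charlton | 2.47 | | | | | | |
| Chelsea | 2.32 | 2.68 | 2.92 | 3.00 | 2.58 | 2.76 | 2.95 |
| Crystal Palace | | | | | 2.13 | 2.58 | 2.37 |
| Everton | 2.32 | 2.53 | 2.37 | 2.50 | 2.63 | 2.58 | 3.00 |
| Fulham | 2.58 | 2.42 | 2.61 | 2.89 | **3.29** | | |
| Liverpool | 2.21 | 2.71 | 2.29 | 3.00 | **3.97** | 2.63 | 2.97 |
| Man City | 1.92 | 2.45 | **3.21** | 2.63 | **3.66** | **3.18** | 2.95 |
| Man United | 2.89 | **3.03** | **3.21** | **3.39** | 2.82 | 2.61 | 2.21 |
| Middlesbrough | 2.45 | | | | | | |
| Newcastle | 2.24 | 2.97 | 2.82 | 2.97 | 2.68 | 2.71 | 2.87 |
| Norwich | | | **3.11** | 2.61 | 2.37 | | 2.79 |
| Portsmouth | 2.29 | | | | | | |
| Southampton | | | | 2.87 | 2.63 | 2.29 | 2.63 |
| Tottenham | 2.92 | 2.66 | 2.82 | 2.95 | 2.79 | 2.92 | 2.74 |
| West Brom | | **3.34** | 2.55 | 2.89 | 2.68 | 2.34 | 2.16 |
| Wigan | 2.53 | 2.66 | 2.74 | **3.16** | | | |
| Season Average | 2.77 | 2.80 | 2.81 | 2.80 | 2.77 | 2.57 | 2.70 |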

Blank cells indicate a season where a team was not in the Premier League.  
Bold figures show where a team achieved over 3 goals per game for the season.
Y2008 indicates the season 2007-2008.
Firstly:  sorting alphabetically makes sense from a listing perspective, but for comparison the data is best sorted numerically (from highest to lowest). 

Secondly:  There's a lot of data here, and clearly a visualisation is needed:  I'm going with a line graph.  And to avoid spaghetti, I'm going to highlight some of the key teams - the team with the highest average number of goals per game; the team with the lowest, and the average.

Thirdly:  to identify the overall highest- and lowest-goal teams, I'm just going to take each team's average across the last 12 seasons, and sort the list by that figure.  Teams that were not in the Premier League for one or more seasons are included based on their performance while they were in the Premier League.
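As a sketch of that step, the snippet below averages only the seasons in which a club actually played, then sorts the clubs from highest to lowest.  The figures are four rows from the seven-season sample table above, so the results will not match the 12-season averages that follow; the function name is my own.

```python
from statistics import mean

# Goals per game by season, from the sample table above.
# None marks a season when the club was not in the Premier League.
goals_per_game = {
    "Arsenal": [2.58, 3.03, 3.24, 2.87, 2.87, 2.82, 2.66],
    "Everton": [2.32, 2.53, 2.37, 2.50, 2.63, 2.58, 3.00],
    "Norwich": [None, None, 3.11, 2.61, 2.37, None, 2.79],
    "Wigan": [2.53, 2.66, 2.74, 3.16, None, None, None],
}


def average_when_present(seasons):
    """Average over the seasons played, skipping the blank (None) ones."""
    return mean(value for value in seasons if value is not None)


ranking = sorted(
    ((club, average_when_present(seasons))
     for club, seasons in goals_per_game.items()),
    key=lambda pair: pair[1],
    reverse=True,
)

for club, average in ranking:
    print(f"{club:<10} {average:.3f}")
```

Because every club plays 38 league games per season, averaging the per-season figures gives the same answer as dividing total goals by total games.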

Premier League Teams:  Average number of goals per game over the last 12 seasons:

| Club | Average |
|---|---|
| Arsenal | 2.842 |
| Tottenham | 2.833 |
| Man City | 2.825 |
| Blackburn | 2.816 |
| Man United | 2.807 |
| Liverpool | 2.781 |
| Newcastle | 2.751 |
| Norwich | 2.717 |
| Bolton | 2.705 |
| Overall Average | 2.702 |
| Birmingham | 2.671 |
| Chelsea | 2.670 |
| West Brom | 2.669 |
| Aston Villa | 2.667 |
| Fulham | 2.613 |
| Southampton | 2.605 |
| Wigan | 2.566 |
| Everton | 2.518 |
| Charlton | 2.474 |
| Middlesbrough | 2.404 |
| Portsmouth | 2.368 |
| Crystal Palace | 2.360 |

Key takeaways:  
- Arsenal have had the most total goals per game over the last 12 seasons (2.842 goals per game)
- Everton have the lowest average number of goals per game for teams which have been present in all 12 seasons (2.518 goals per game).
- Put another way:  Arsenal fans have seen 1296 league goals in the last 12 seasons, compared to 1148 for Everton fans (148 fewer).
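The goal totals and the per-game averages in these takeaways are two views of the same numbers, and a quick check confirms they agree.  The only things I'm adding here are the 38 league games each club plays per season and the function name.

```python
GAMES_PER_SEASON = 38  # each club plays 38 league games in a 20-team league
SEASONS = 12


def goals_per_game(total_goals, seasons=SEASONS,
                   games_per_season=GAMES_PER_SEASON):
    """Convert a goal total over several full seasons into goals per game."""
    return total_goals / (seasons * games_per_season)


arsenal = goals_per_game(1296)
everton = goals_per_game(1148)

print(f"Arsenal: {arsenal:.3f} goals per game")
print(f"Everton: {everton:.3f} goals per game")
print(f"Difference: {1296 - 1148} goals over {SEASONS} seasons")
```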


Theo Walcott, celebrating during Arsenal's win over Hull, Sept 2016  Image credit

Time for some graphs!

Firstly, average goals per game by season, for the last 12 seasons, for Arsenal, Everton, the league average, Liverpool (who achieved an average of 3.97 in 2013-14) and Man United (because they're always worth comparing).



This shows clearly that Arsenal (green line) have consistently exceeded the league average, falling below it only twice in the last 12 seasons.  Everton (blue) have only once exceeded the average, and that was in the most recent season.  Liverpool have exceeded the average over the last four seasons, but prior to that were consistently below (and similar to Everton).

Connecting this to 'real life' events:

- Everton moving from David Moyes to Roberto Martinez in August 2013 did not make any difference to their 'excitement' factor until the 2015-16 season.

- Arsenal, and Arsene Wenger, could not be called 'boring' based on their goals per game. 

- Brendan Rodgers had an interesting time at Liverpool, when they hit the highest goals-per-game figure for a season of any club in the last 12 years (3.97).  Note that this does not discriminate between goals scored and goals conceded.

Secondly, adjusting the data to show the difference between each team and the overall average (so that the data shows a delta versus the average).



To give you an indication of Liverpool's remarkable 2013-14 season:  their games averaged 3.97 goals, more than one goal per game above the season average of 2.77.  Brendan Rodgers had an eventful time at Liverpool.

Fulham also had an 'exciting' season in 2013-4, achieving 3.29 goals per game (average was 2.77) - but were subsequently relegated.
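The delta view is just each club's figure minus the season average.  Here's a sketch for 2013-14, using the Y2014 values from the sample table earlier in this post; the variable names are my own.

```python
season_average_2013_14 = 2.77

# Goals per game in 2013-14 (the Y2014 column of the sample table).
goals_per_game_2013_14 = {
    "Liverpool": 3.97,
    "Fulham": 3.29,
    "Arsenal": 2.87,
    "Everton": 2.63,
}

# Difference between each club and the season average, to two decimals.
deltas = {
    club: round(value - season_average_2013_14, 2)
    for club, value in goals_per_game_2013_14.items()
}

for club, delta in sorted(deltas.items(), key=lambda pair: pair[1], reverse=True):
    print(f"{club:<10} {delta:+.2f}")
```

Plotting the deltas rather than the raw figures strips out the season-to-season movement in the league average, so only each club's distance from 'typical' remains.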

In summary:

- Arsenal have had the highest average goals per game over the last 12 seasons (2.842 goals per game), while Everton have the lowest of the ever-present teams, at 2.518 goals per game.
- Arsenal have exceeded the league average goals per game in 10 out of the last 12 seasons, and have the highest average overall.
- Man United have achieved above-average goals per game in nine of the last 12 seasons; however the 2015-16 season was the least 'exciting' they've recorded in that period.

Review

Segmenting the data by team is proving more useful.  It's now possible to make predictions about the 2016-17 season:

- Arsenal to remain most 'exciting', closely followed by Tottenham and Man City.
- Everton to remain the least 'exciting', with 1-1, 2-1 and 2-0 results dominating.
- Man United are extremely unpredictable, especially as they have a new manager this season (although nobody could have predicted the dreadful start they've made to the current season).

The raw data used in this analysis is available from the football data website, among others.

More articles on data analysis in football:

Reviewing Manchester United's Performance
Should Chelsea Sack Jose Mourinho? (it was relevant at the time I wrote it)
How exciting is the English Premier League?  (quantifying a qualitative metric)
The Rollarama World Football Dice Game (a study in probability)