I am the first to admit that I am not a full-time app developer, nor do I have an advanced degree in business intelligence or analytics. But my curiosity about unanswered questions guides my data analysis.
That curiosity stems from early education: I can remember my 6th grade science class, where we were taught how to use the Scientific Method to complete our experiments. Each experiment required five steps: Question, Hypothesis, Prediction, Experiment and Analysis. We were told to carefully log the results of each step and share them with the class.
It's 2016, and I find myself following the Scientific Method once again in seeking to answer a nerdy sports question: how many New England hockey players have been drafted as part of the QMJHL Draft over the last five years? For a bit of added context: the Quebec Major Junior Hockey League holds its annual draft for players aged 16-20 in June, and its teams select the top players from Quebec, New Brunswick, Nova Scotia and Prince Edward Island. As of 2012, each team has also been required to draft US players from New England, hence my question: how many have had their name called?
For a hypothesis, I assumed that the number would spike in 2012 over 2011, and I figured it would continue to grow from 2013 to 2015 as more teams became familiar with the New England talent pool.
As for a prediction, I assumed each team would select 2 to 3 New England players each year. With 18 teams, that is a maximum of 54 per year, or 270 players over 5 years. But since there was no rule in place in 2011, I cut that season's total to 1 New Englander per team (18 instead of 54), which gives a final tally of 234 players.
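For anyone who wants to check my math, the prediction above can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the function and constant names are my own inventions, the numbers are the guesses from the paragraph above, and none of it comes from the Qlik Sense app.

```python
# All values are the assumptions stated in the prediction, not measured data.
TEAMS = 18                    # teams drafting each year
MAX_PICKS_PER_TEAM = 3        # upper end of the 2-3 range
SEASONS = 5                   # 2011 through 2015
PRE_RULE_PICKS_PER_TEAM = 1   # guess for 2011, before the rule existed


def predicted_total(teams=TEAMS,
                    max_picks=MAX_PICKS_PER_TEAM,
                    seasons=SEASONS,
                    pre_rule_picks=PRE_RULE_PICKS_PER_TEAM):
    """Return the predicted number of New England picks over all seasons."""
    per_season_max = teams * max_picks       # 18 * 3 = 54
    rule_seasons = seasons - 1               # every season except 2011
    return per_season_max * rule_seasons + teams * pre_rule_picks


print(predicted_total())  # 234
```

Passing `pre_rule_picks=3` reproduces the unadjusted figure of 270, which makes it easy to see how much the 2011 adjustment changes the prediction.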
Next came the "experiment," for which I built a data model in Qlik Sense that is as basic as they come. The model allowed me to compare not only which region the players hail from but also which heights were most regularly selected, which teams drafted the most players from New England, etc.
And now for the fun part: the analysis! In the app I built a series of bar charts and treemaps to answer my initial question. It turns out that the final number was 170 players: 64 fewer than my prediction! A closer look at the data makes it clear that teams prefer to pick north of the border, with 1,058 players (86%) selected from there over the last 5 years. Additionally, I did see a major spike in 2012 when the rule was added, but there hasn't been a yearly increase: you can see the numbers actually drop in 2015...
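To put the gap between prediction and result in perspective, here is a small Python sketch that compares the two. It uses only numbers quoted in this post (234 predicted, 170 observed, 18 teams, 5 years); the variable names are made up for illustration, and it assumes 18 teams drafted in every one of the five years.

```python
PREDICTED = 234   # my prediction
OBSERVED = 170    # New England players actually drafted
TEAMS = 18        # assumed constant across all five drafts
SEASONS = 5

# How far off was the prediction?
shortfall = PREDICTED - OBSERVED

# What fraction of the predicted picks actually happened?
share_of_prediction = OBSERVED / PREDICTED

# Average New England picks per team per draft, across all five years.
avg_per_team_season = OBSERVED / (TEAMS * SEASONS)

print(f"Shortfall: {shortfall} players")                         # 64
print(f"Share of prediction: {share_of_prediction:.1%}")         # 72.6%
print(f"Average per team per draft: {avg_per_team_season:.2f}")  # 1.89
```

That average of just under 2 picks per team per draft includes 2011, when there was no rule in place, so it sits at the low end of the 2 to 3 range I had guessed.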
Chalk up another victory for the Scientific Method! Now I guess I have to find another question to answer...
Photo credit: JHikka via Foter.com / CC BY-NC-ND