Back in late January I wrote a blog post about data as an ingredient in analytics, and how it’s not much use without the recipe. I’m pleased to say this prompted a number of good discussions, and a couple of weeks ago I hosted a Qlik Supper Club in London where we talked about the topic further. The consensus was overwhelming that a discussion about data alone is of little value, and all the customers attending agreed that the use case and an understanding of the skills in your organization, what I referred to as the recipe, are much more interesting.
It also prompted another debate. Why is there so much focus on data in certain parts of the analytics market, when those same parts of the market don’t treat your data very well? This seems like a fairly significant contradiction: there is a lot of discrimination, and the result is that not all your data is treated equally. And let’s be honest, given all the variation in our data, and all the sources and tables we have to work with, getting this right should be pretty high up the priority list. So I thought I’d investigate.
After talking to a few experts, it turns out that some tools on the market force the relationship between data coming from different data sources. They make data from what they call the unique ‘primary data source’ more important than data from the ‘secondary data source(s)’. In technical terms, this is a left join. This turns out to be important because companies often have incomplete data, and a left join on incomplete data creates asymmetry: anything that exists only in a secondary source is dropped, so you may well miss rows and lose important dimensions. All pretty technical stuff, but the result is obvious to me: your data is not being treated equally! What’s more, if you’re a business user like me you may not even be aware of this. If you do spot it, you may end up needing technical skills and additional data integration capabilities to work around the problem, and that’s not very self-service.
Perhaps we at Qlik just take this for granted, as our associative engine enables you to work with all data from all your data sets, dynamically creating relationships between the data as needed after every user interaction. I’m told it is technically called a “many to many full outer join”, but for me it just always works. Put simply, we offer greater analysis capabilities (all dimensions, all data), allowing users to make discoveries in places where other solutions would leave you blind. Why? Because they might simply exclude that critical piece of data, that gem, that nugget which hides an insight that might just change the decision you make. And who knows which direction that might take you?
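For readers who want to see the difference between a left join and a full outer join, here is a minimal sketch in plain Python. It is only an illustration of the two join types, not how Qlik's associative engine or any other product is implemented. The `sales` and `targets` data, the region names, and both helper functions are made up for the example, and each region appears at most once per source, so it does not cover the many-to-many case.

```python
# Hypothetical data: sales come from the "primary" source,
# targets from a "secondary" source. Neither is complete.
sales = {"North": 120, "South": 95, "East": 40}      # region -> units sold
targets = {"North": 100, "South": 110, "West": 70}   # region -> target units


def left_join(primary, secondary):
    """Keep every key in primary; keys found only in secondary are dropped."""
    return {key: (primary[key], secondary.get(key)) for key in primary}


def full_outer_join(primary, secondary):
    """Keep every key that appears in either source; gaps become None."""
    keys = sorted(set(primary) | set(secondary))
    return {key: (primary.get(key), secondary.get(key)) for key in keys}


left = left_join(sales, targets)
full = full_outer_join(sales, targets)

# Regions present in the full outer join but lost by the left join.
missing = sorted(set(full) - set(left))

print("left join: ", left)
print("full outer:", full)
print("dropped by left join:", missing)
```

In this toy data the West region has a target but no sales yet, so it exists only in the secondary source. The left join drops it without any warning, while the full outer join keeps it with an empty sales value, which is exactly the kind of gap a business user would want to see.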
Bottom line: We believe you need to work as you think, not as your data is structured. You need free-form, unrestricted analysis and agility so you can rapidly innovate, always asking the next question. No blind spots, no data left behind, and all data treated equally.
Photo credit: Beraldo Leal / Foter / CC BY