I was an Economics student. Or at least I pretended to be one whenever an exam loomed on the horizon. In the end, I managed to graduate and was awarded a BA, which irked me. Why not a BSc? In my mind, Economics was a precise science and deserved to be recognised as such.
That was wrong. It is now very clear that Economics is indeed an art and not a science. Anybody who believes differently should refer themselves back to the events of 2008. It is also clear that we are not rational economic agents; we are in fact irrational (but predictable) consumers. All very irritating to the liberal economists amongst us.
It was this wrong-headed approach to understanding the world that I applied to my dissertation: a close (and very critical) look at the Cuban economy. With the recent passing of Fidel, I was reminded of the many hours I had sunk into this completely forgettable work and the analytical errors I made.
My younger self thought it very clear that Comrade Fidel had made some catastrophic assumptions in his economic plans (I was right) and that as a result, the economy would be a wreck (I was wrong). Far from being a wreck, Cuban GDP growth has been consistently above that of its neighbours. That’s especially impressive when you consider the punitive US trade embargo. Alternative measures of economic development (e.g., social welfare) are spectacular; Cuba has the same life expectancy as the US. In other words, my models were quite significantly wrong about the future.
All very interesting (or not, depending on your interest in the Cuban economy), but what has this got to do with modelling the future? Well, my early mishaps in economic forecasting are a rather good example of the strengths and weaknesses of analytical modelling in general.
Perhaps I might have had better luck if I had applied three simple rules that we recommend for all People Analytics projects. I might also have had a better mark for my dissertation:
(1) Be transparent about the ‘confidence levels’ of any prediction.
Anyone who is 100% confident about the predictive power of their model is naïvely foolish. Nobody can be perfectly accurate in their views of the future. Should that person exist, they would be impossibly rich, having applied their incredible foresight to the equity markets. In other words, we would know about them. Even Warren Buffett gets it wrong from time to time.
Whenever we build predictive models, we must be very clear about our confidence levels. We should quantify the rate of false discovery and make that clear to any reader. Being transparent creates confidence in the analysis, a prerequisite for informed decision making. This is essential for HR and critical for building the credibility of all People Analysis.
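To make this concrete, here is a minimal Python sketch of reporting an estimate together with its uncertainty. It uses a percentile bootstrap, which is one common way of attaching an interval to a number and certainly not the only one. The attrition figures, the function name `bootstrap_interval` and its parameters are all invented for illustration.

```python
import random


def bootstrap_interval(outcomes, level=0.95, n_resamples=2000, seed=0):
    """Return (point_estimate, lower, upper) for the mean of 0/1 outcomes."""
    rng = random.Random(seed)
    n = len(outcomes)
    point = sum(outcomes) / n
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        sum(rng.choices(outcomes, k=n)) / n for _ in range(n_resamples)
    )
    # Take the matching percentiles of the resampled means as the interval.
    tail = (1 - level) / 2
    lower = means[int(tail * n_resamples)]
    upper = means[min(n_resamples - 1, int((1 - tail) * n_resamples))]
    return point, lower, upper


# Hypothetical data: 1 = employee left within the year, 0 = stayed.
leavers = [1] * 18 + [0] * 102
point, lower, upper = bootstrap_interval(leavers)
print(f"Estimated attrition: {point:.1%} ({lower:.1%} to {upper:.1%} at 95%)")
```

The point is the shape of the output: a reader sees a range alongside the headline figure, and can judge for themselves how much weight the number will bear.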
The certainty with which I presented the conclusions of my dissertation makes for embarrassing and uncomfortable reading today. Especially given that most of my predictions were eventually proved wrong.
(2) Be expansive when defining your dataset but avoid the spurious.
Building a model requires data. That means working with what is available and often making do. Unless of course you can design an experiment that creates new data. In the case of most People Analysts, we often feel limited to the data that sits within the HR systems.
Undoubtedly HR system data is a good starting point. But what other sources of employee information are available? Do other functions (i.e., outside of HR) have interesting data about shift patterns or other operational information about employees? About 80% of employee data is written, text data (e.g., comments in a performance review) – how can we analyse this? Can we use surveying to create new datasets? With some creativity, it is possible to dramatically expand the scope (and potentially the power) of any predictive analysis.
That said, data should be limited to what is pertinent. Just because you know the shoe size of your employees doesn’t mean you should analyse it; we don’t want to create spurious results. We believe in using ‘any and all’ relevant data without being limited by format or systems.
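The shoe size problem is easy to demonstrate. The Python sketch below generates 40 features that are pure noise by construction, tests each one against an equally random outcome, and counts how many look 'significant'. It then applies the Benjamini-Hochberg procedure, one standard way of controlling the false discovery rate. All of the data is simulated, the function names are our own, and the p-values rely on the Fisher z approximation, so treat it as an illustration and not as a production method.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def approx_p_value(r, n):
    """Two-sided p-value for r, using the Fisher z approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices that survive Benjamini-Hochberg FDR control."""
    m = len(p_values)
    ranked = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(ranked, start=1):
        if p_values[i] <= rank / m * alpha:
            cutoff = rank  # keep the largest rank that passes
    return set(ranked[:cutoff])


rng = random.Random(42)
n_employees, n_features = 200, 40
# Hypothetical outcome, generated independently of every feature below.
performance = [rng.gauss(0, 1) for _ in range(n_employees)]
# Irrelevant features (think shoe size): noise with no real relationship.
p_values = []
for _ in range(n_features):
    feature = [rng.gauss(0, 1) for _ in range(n_employees)]
    p_values.append(approx_p_value(pearson_r(feature, performance), n_employees))

naive_hits = {i for i, p in enumerate(p_values) if p <= 0.05}
fdr_hits = benjamini_hochberg(p_values)
print(f"'Significant' at 5% with no correction: {len(naive_hits)} of {n_features}")
print(f"Surviving false discovery control: {len(fdr_hits)}")
```

With a 5% threshold and 40 unrelated variables, we would expect about two false alarms on average from chance alone. Any hit in the uncorrected count is spurious by construction, which is why irrelevant variables are best left out and why the remaining tests deserve a correction.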
In my case, the Cuban government didn’t seem to be overly willing to collaborate with a British undergraduate. Consequently, my dataset was limited and so, therefore, was my analysis.
(3) Be open to taking a shorter, easier path when it comes to computation.
Many of the People Analysts I speak to are rightly proud of what they have been able to discover using the analytical toolkit of Excel, R and even SPSS. This is analysis that must have taken many hours to compile and requires a solid understanding of statistics. It is also analysis that can perhaps be done faster and better by an intelligent machine.
Fast, interactive forms of data discovery software are now available, including some software that is specialised to support People Analysts. Specialisation and the application of Machine Learning techniques enable users to rapidly test many complex hypotheses using very large datasets. Analysis that might have taken weeks can, thanks to better technology, instead take hours.
Back in my undergraduate days, hypothesis testing was done the hard way. The lack of intelligent tech meant the range and speed of my enquiry were limited, making it much harder to produce interesting results in a short space of time. I had only one term to write my dissertation and did far too little work during the opening hours of the Student Union bar.
Hasta la Victoria Siempre
Modelling the future and predictive analysis are hard. They will continue to be hard, but with the help of technology, we can dramatically increase the number and quality of statistical techniques available to many People Analysts. The consequent growth in capability of many People Insight teams makes me feel it is time for another prediction…
… in the very near future, all HR teams will create forecasts of their key people metrics. And some will even get their forecasts right. Until the victory, always.
Pete Clark and Alex Borekull, Qlearsite.
Qlearsite provides People Analytics technology and services, applying the latest Big Data and Machine Learning technologies to provide Organisations with insights that deliver value. We call our work ‘Organisational Science’, whether that’s predictive toolkits, visual data discovery or survey text analytics using machine intelligence techniques.