In the enterprise world, data is never clean. Data is never complete. Data is never free of errors, and it is never all in one place. Worse still, the meaning of the data often changes over time as business processes change or the systems that store it are upgraded.
There is an age-old saying in the computer industry: “Garbage in, garbage out.” In other words, if you feed a computer flawed data, don’t expect a flawless answer.
All this does not bode well for AI, which likes to consume vast amounts of data.
Traditionally, the data science approach has been to prepare the data first – to plug the gaps and correct the data as much as possible so that you are comparing apples with apples.
I’d like to suggest a different mindset. A communications engineer views the world in terms of signal and noise. There will always be noise. The goal is to maximise the signal relative to the noise.
So, rather than over-investing in data prep (particularly since the value of the data is uncertain in the early stages), prioritise AI approaches that:
- are resilient to noise and agnostic to the meaning of the data
- can start as a small pilot then quickly scale to your entire customer base
- continuously learn as your data changes
As we recently demonstrated for an Australian energy retailer that wanted to predict customer churn, this approach can achieve unexpectedly high prediction accuracy.
But prediction alone is not the answer.
A system that only predicts doesn’t deliver a business outcome. Instead, you need two engines: one that predicts continuously as the data changes, and a second “treatment engine” that decides when and how to act on a prediction, or whether to act at all. Where is the cut-off point that maximises the benefit? In the case of customer churn, how do you minimise the combined cost of keeping customers and replacing them? How do you maximise the total number of customers? Can the system learn from your attempts to reduce churn?
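To make the cut-off question concrete, here is a minimal sketch of how a treatment engine might choose a churn-probability threshold that minimises the combined cost of retaining and replacing customers. It is an illustration only, not a description of our software: the `Costs` fields, the function names, the simple cost model and every number in it are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Costs:
    """Hypothetical cost assumptions, for illustration only."""
    offer: float      # cost of making a retention offer to one customer
    replace: float    # cost of acquiring a replacement for a lost customer
    save_rate: float  # assumed fraction of would-be churners the offer retains


def expected_cost(probs, threshold, costs):
    """Expected total cost if every customer at or above the threshold gets an offer."""
    total = 0.0
    for p in probs:
        if p >= threshold:
            # We pay for the offer, and still lose some of the churners.
            total += costs.offer + p * (1 - costs.save_rate) * costs.replace
        else:
            # No action: we pay the replacement cost if the customer churns.
            total += p * costs.replace
    return total


def best_threshold(probs, costs):
    """Try each predicted probability as a cut-off, plus 'treat nobody'."""
    candidates = sorted(set(probs)) + [1.01]  # 1.01 means no customer is treated
    return min(candidates, key=lambda t: expected_cost(probs, t, costs))


# Made-up churn probabilities from the prediction engine.
churn_probs = [0.05, 0.1, 0.3, 0.6, 0.9]
costs = Costs(offer=20.0, replace=200.0, save_rate=0.5)

cutoff = best_threshold(churn_probs, costs)
print(f"Treat customers with churn probability >= {cutoff}")
print(f"Expected cost: {expected_cost(churn_probs, cutoff, costs):.2f}")
```

Under this toy model, an offer pays for itself only when the churn probability exceeds `offer / (save_rate * replace)`, which is 0.2 with the numbers above. To learn from its own retention attempts, a system would need to re-estimate `save_rate` from observed outcomes rather than fix it up front.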
These are some of the principles we hold dear in our software.