I’m often told that Machine Learning sounds complicated – but it doesn’t have to be. If I were asked to explain ML in 20 words or fewer, this is what it would sound like:
Understand the problem. Clean up the data. Investigate relationships. Engineer the dataset. Build the model. Tune to high performance.
At its core, ML is pretty straightforward, but it does need to follow a process. Here’s a more in-depth breakdown of the stages that can help you turn your data into insights you can act on:
- Understand – We can’t improve what we don’t understand, so our solutions are always grounded in a deep understanding of a process and the data related to that process.
- Clean – The real world is messy, and data is almost never in the state we’ve been told it is. To get data ready for both analysis and (eventually) machine learning, we have to clean and process it.
- Investigate – Before we can teach a machine what is important in a dataset, we have to understand it ourselves. Investigating data is really about driving a deeper understanding of a dataset, its correlations and relationships, identifying patterns, and so on. It’s rare that complex processes have simple solutions, but it’s often relatively simple analysis that sets us on the path of a solution.
- Engineer – Machines are not smarter than humans; they are just great at fast math. But to learn best, they must be taught in very specific ways. This step is about prepping a dataset to train a model in the best way possible, as well as about bringing new information to the model to give it the best chance of seeing the signal we want.
- Build & Tune – This is the fun part: creating, testing, and tuning predictive models. This stage includes retraining models as new data becomes available, as well as assessing model performance over time and doing maintenance work to make sure the model continues to deliver value.
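To make the stages concrete, here is a minimal sketch of Clean, Investigate, Engineer, Build and Tune in plain Python. Everything in it is an assumption made for illustration: the toy data is synthetic, the function names (`clean`, `pearson`, `standardize`, `fit_ridge`, `run_pipeline`) are hypothetical, and the model is a one-feature ridge regression chosen only because it fits in a few lines. The Understand stage is deliberately absent, since it happens with people rather than in code.

```python
import random
from statistics import mean, pstdev

def clean(rows):
    """Clean: drop rows that have a missing value."""
    return [(x, y) for x, y in rows if x is not None and y is not None]

def pearson(xs, ys):
    """Investigate: measure the linear relationship between two columns."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def standardize(xs, mu, sigma):
    """Engineer: rescale a feature using a given mean and standard deviation."""
    return [(x - mu) / sigma for x in xs]

def fit_ridge(xs, ys, lam):
    """Build: closed-form ridge regression with one feature and penalty lam."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / (sxx + lam)
    return slope, my - slope * mx

def mse(model, xs, ys):
    """Mean squared error of a (slope, intercept) model."""
    slope, intercept = model
    return mean((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))

def run_pipeline(seed=0):
    # Synthetic toy data (an assumption): y is roughly 3x + 2 plus noise,
    # and about 10% of the x readings are missing.
    rng = random.Random(seed)
    raw = []
    for _ in range(200):
        x = rng.uniform(0, 10)
        y = 3.0 * x + 2.0 + rng.gauss(0, 1)
        raw.append((None if rng.random() < 0.1 else x, y))

    rows = clean(raw)
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    correlation = pearson(xs, ys)

    # Hold out the last 20% of rows for validation; scale with training statistics only.
    split = int(0.8 * len(rows))
    mu, sigma = mean(xs[:split]), pstdev(xs[:split])
    x_train = standardize(xs[:split], mu, sigma)
    x_val = standardize(xs[split:], mu, sigma)
    y_train, y_val = ys[:split], ys[split:]

    # Tune: try a few penalty strengths and keep the one with the lowest validation error.
    scores = {
        lam: mse(fit_ridge(x_train, y_train, lam), x_val, y_val)
        for lam in (0.0, 1.0, 10.0, 100.0)
    }
    best_lam = min(scores, key=scores.get)
    return {
        "rows_kept": len(rows),
        "correlation": correlation,
        "best_lambda": best_lam,
        "val_mse": scores[best_lam],
    }

if __name__ == "__main__":
    print(run_pipeline())
```

On a real project each function would be replaced by something far richer, but the shape stays the same: the data is cleaned before anyone looks at relationships, features are scaled using only the training rows, and tuning is judged on data the model has not seen.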
Don’t let complex terminology overwhelm you when it comes to using ML. All it takes is 20 words and 1 open mind.