The Machine Learning Process

Last updated Apr 10, 2024 | Agile, Life Cycle, Project Management

The machine learning process defines the flow of work that a data science team executes to create and deliver a machine learning model. The ML process also defines how the team works together to create the most useful predictive model.

A High Level Machine Learning Process

A high level view of the steps in the machine learning process was described in our post on machine learning life cycles. In short, this workflow includes problem exploration, data engineering, model engineering and ML Ops.

The Benefit of a More Detailed Machine Learning Process

While this high level workflow (which some people refer to as a life cycle) is helpful for providing an overall summary of the phases in a machine learning project, it does not provide an intuitive explanation of the work required to actually create a predictive model.

In other words, a more detailed machine learning process can provide a better non-technical view of the work required to build a machine learning model. This gives the entire team an intuitive understanding of the steps required to build a model, and hence of how to prioritize the work and how much time each step might take.

A More Detailed Machine Learning Process

This more detailed process keeps the same high level phases (problem exploration, data engineering, model engineering and ML Ops), but defines the key steps within each phase of the ML process. Below is a discussion on each of the steps in the process.

Problem Exploration

First, focus on how the model will be used. In the process, assess the desired model accuracy and explore other details, such as whether false positives are worse than false negatives. This phase also includes understanding what data might be available.

  • Define Success: Define the problem to be solved (for example, what should be predicted). This helps define what data will be needed. Also, make sure it's clear how success will be measured.
  • Evaluate Data: Determine which data sources are relevant. In other words, evaluate what data the team will need, how that data is collected, and where the data is stored.
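One way to make "how success will be measured" concrete is to put a cost on each kind of mistake. The sketch below is only an illustration: the two cost constants, the labels, and the two sets of model predictions are hypothetical values, not figures from any real project.

```python
# Hypothetical costs (assumptions for illustration only).
COST_FALSE_POSITIVE = 1.0   # e.g., an unnecessary follow-up
COST_FALSE_NEGATIVE = 5.0   # e.g., a missed case that is expensive later


def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary labels (0 or 1)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            counts["tp"] += 1
        elif p == 1 and a == 0:
            counts["fp"] += 1
        elif p == 0 and a == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def accuracy(counts):
    """Share of predictions that were correct."""
    return (counts["tp"] + counts["tn"]) / sum(counts.values())


def total_cost(counts):
    """Weight each kind of mistake by its assumed business cost."""
    return counts["fp"] * COST_FALSE_POSITIVE + counts["fn"] * COST_FALSE_NEGATIVE


# Made-up labels and predictions from two hypothetical models.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [1, 1, 1, 1, 1, 0, 1, 0]   # over-predicts positives
model_b = [1, 0, 0, 1, 0, 0, 0, 0]   # misses positives

counts_a = confusion_counts(actual, model_a)
counts_b = confusion_counts(actual, model_b)

for name, counts in (("model_a", counts_a), ("model_b", counts_b)):
    print(name, "accuracy:", accuracy(counts), "cost:", total_cost(counts))
```

With these made-up numbers, both models reach the same accuracy (0.75), but model_b is five times as costly because all of its mistakes are false negatives. Agreeing on the costs up front tells the team which model counts as "better".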

Data Engineering

Design and build data pipelines. These pipelines get, clean and transform data into a format that is more easily used to build a predictive model. Note that this data might be coming from multiple data sources, so merging the data is also a key aspect of data engineering. This is often where the most time is spent in an ML project.

  • Obtain Data: Assemble the data. This includes connecting to remote data stores and databases, which might be in different formats. For example, some data might be in CSV format, and other data could be available in JSON via web services.
  • Scrub Data: Re-format particular attributes and correct errors in the data, for example by imputing missing values. Datasets are often missing values, or they may contain values of the wrong type or range. Cleaning can include removing duplicates, correcting errors, dealing with missing values, normalization, and handling data type conversions.
  • Explore / Validate Data: Get a basic understanding of the data. This exploratory analysis includes data profiling to obtain information about the content and structure of the data. The goal is to understand both the data attributes and the quality of the data.
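The three steps above can be sketched as a tiny pipeline. This is a minimal illustration using only the Python standard library. The column names, the in-memory CSV and JSON strings (standing in for a file export and a web service response), and the choice of mean imputation are all assumptions made for the example.

```python
import csv
import io
import json

# Hypothetical inputs standing in for a CSV export and a JSON web service response.
CSV_TEXT = """customer_id,age,plan
1,34,basic
2,,premium
2,,premium
3,51,basic
"""
JSON_TEXT = (
    '[{"customer_id": 1, "monthly_spend": 20.0},'
    ' {"customer_id": 2, "monthly_spend": 55.5},'
    ' {"customer_id": 3, "monthly_spend": null}]'
)


def obtain(csv_text, json_text):
    """Obtain: read both sources and merge them on customer_id."""
    spend = {r["customer_id"]: r["monthly_spend"] for r in json.loads(json_text)}
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        cid = int(row["customer_id"])
        rows.append({
            "customer_id": cid,
            "age": row["age"],               # still a string at this point
            "plan": row["plan"],
            "monthly_spend": spend.get(cid),
        })
    return rows


def scrub(rows):
    """Scrub: drop duplicate customers, convert types, impute missing numbers."""
    seen, cleaned = set(), []
    for row in rows:
        if row["customer_id"] in seen:
            continue
        seen.add(row["customer_id"])
        cleaned.append({**row, "age": int(row["age"]) if row["age"] else None})
    for column in ("age", "monthly_spend"):
        known = [r[column] for r in cleaned if r[column] is not None]
        mean = sum(known) / len(known)       # mean imputation is one option among many
        for r in cleaned:
            if r[column] is None:
                r[column] = mean
    return cleaned


def profile(rows):
    """Explore: count missing values (None or empty string) in each column."""
    return {c: sum(1 for r in rows if r[c] in (None, "")) for c in rows[0].keys()}


raw_rows = obtain(CSV_TEXT, JSON_TEXT)
print("missing before scrubbing:", profile(raw_rows))
clean_rows = scrub(raw_rows)
print("missing after scrubbing:", profile(clean_rows))
```

Profiling the data both before and after scrubbing gives the team a quick check that the cleaning step did what was intended.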

Model Engineering

This is the phase that most people associate with building a machine learning model. During this phase, data is used to train and evaluate the model. This is often an iterative task, where different models are tried and the chosen model is tuned.

  • Select & Train Model: Identify an appropriate model, and then build / train the model (on training data). The goal of training is to answer a question or make a prediction correctly as often as possible.
  • Test Model: Run the model on data that the model has not yet seen (such as testing data). In other words, test the model using data that was withheld from training (a holdout set; for time-series problems, this is often called backtesting).
  • Evaluate & Interpret Model: Objectively measure the performance of the model. Basic evaluation explores metrics such as accuracy and precision, to determine if the model is usable, and which model is best for the specific problem being explored. This evaluation also includes understanding when the model makes mistakes. More generally, validating the trained model helps to ensure the model meets the original organizational objectives before it is put into production.
  • Tune Model: This step refers to parameter tuning (often called hyperparameter tuning), which, depending on the model being used, can be more an art than a science. In short, models typically have parameters (i.e., dials for tuning the model), which allow the model to achieve better performance via parameter refinement. Simple examples include the number of training steps and the initialization of certain values.
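The four steps above can be walked through end to end with a very small model. The sketch below uses a one-feature k-nearest-neighbors classifier, chosen only because it fits in a few lines (its "training" is simply storing the examples). The data points are made up, and using a separate validation set for tuning is an assumption of this example rather than something the steps above prescribe.

```python
from collections import Counter

# Made-up (feature, label) pairs; (2.5, 1) is a deliberately noisy example.
train = [(1.0, 0), (2.0, 0), (3.0, 0), (4.5, 0), (5.0, 1),
         (6.0, 1), (7.0, 1), (8.0, 1), (2.5, 1)]
validation = [(2.4, 0), (4.1, 0), (5.6, 1), (7.4, 1)]     # used only for tuning
test = [(1.4, 0), (2.7, 0), (4.7, 0), (4.8, 1), (6.4, 1)]  # withheld until the end


def knn_predict(train_data, x, k):
    """Predict the majority label among the k training points closest to x."""
    nearest = sorted(train_data, key=lambda point: abs(point[0] - x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def evaluate(train_data, data, k):
    """Return accuracy, precision, and the misclassified examples."""
    predictions = [knn_predict(train_data, x, k) for x, _ in data]
    pairs = list(zip(data, predictions))
    correct = sum(1 for (_, y), p in pairs if y == p)
    true_pos = sum(1 for (_, y), p in pairs if y == 1 and p == 1)
    pred_pos = sum(predictions)
    return {
        "accuracy": correct / len(data),
        "precision": true_pos / pred_pos if pred_pos else 0.0,
        "mistakes": [(x, y, p) for (x, y), p in pairs if y != p],
    }


# Tune: try a few values of the "dial" k and keep the best one on validation data.
candidate_ks = [1, 3, 5]
validation_scores = {k: evaluate(train, validation, k)["accuracy"] for k in candidate_ks}
best_k = max(candidate_ks, key=lambda k: validation_scores[k])

# Test and evaluate: measure the tuned model once on data it has never seen.
test_report = evaluate(train, test, best_k)
print("validation accuracy by k:", validation_scores)
print("best k:", best_k)
print("test report:", test_report)
```

Tuning on a validation set and reporting results on a separate test set keeps the final measurement honest, because the test data played no part in choosing k. The list of mistakes is a starting point for understanding when the model gets things wrong.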

ML Ops

Broadly defined, machine learning operations (ML Ops) spans a wide set of practices, systems, and responsibilities that data scientists, data engineers, cloud engineers, IT operations, and business stakeholders use to deploy, scale, and maintain machine learning solutions.

  • Deploy Model: Package and put the model to use (i.e., into production). While this varies from one group to another, the team needs to understand the expected model performance, how the model will be monitored, and in general, key performance indicators (KPIs) of the model.
  • Monitor Model: Maintain the model in production. This includes monitoring the KPIs and proactively working to ensure stable and robust predictions.
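As a rough illustration of monitoring a KPI in production, the sketch below tracks accuracy over the most recent predictions and raises a flag when it falls below a threshold. The class name, window size, threshold, and the stream of (predicted, actual) pairs are all hypothetical. A real deployment would typically rely on the team's existing monitoring tools.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over the most recent predictions and flag drops."""

    def __init__(self, window_size, min_accuracy):
        # Each entry is True when a prediction matched the actual outcome.
        self.outcomes = deque(maxlen=window_size)
        self.min_accuracy = min_accuracy

    def record(self, predicted, actual):
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        """Accuracy over the current window, or None if nothing is recorded yet."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_attention(self):
        """Flag only once the window is full, to avoid noisy early alarms."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.min_accuracy


# Hypothetical stream of (predicted, actual) pairs arriving over time.
stream = [(1, 1), (0, 0), (1, 1), (1, 1), (0, 0), (1, 0), (1, 0), (0, 0)]

monitor = AccuracyMonitor(window_size=5, min_accuracy=0.8)
alerts = []
for predicted, actual in stream:
    monitor.record(predicted, actual)
    alerts.append(monitor.needs_attention())

print("alerts:", alerts)
print("current windowed accuracy:", monitor.accuracy())
```

In this made-up stream, the flag is raised from the seventh prediction onward, once two recent mistakes pull the windowed accuracy down to 0.6.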

The Machine Learning Process Coordination Framework

When most people describe the machine learning process, they focus only on the steps required to build a predictive model (i.e., the steps just discussed) or more generally, the machine learning life cycle. This might be appropriate if the work is being done by one person, such as a researcher doing some analysis.

However, creating and using predictive models is increasingly becoming a team sport. A modern data science team needs to define both the steps of the project and how the team members working on the project will coordinate.

For example, note that while the arrows in the diagram show a continuous flow, the team might need to go back to a previous phase or step. How does the team determine "when to move forward" and "when to take a step back"? This is where a coordination framework can be useful.

Together, the steps of the project combined with a coordination framework create a comprehensive process that can guide the team toward successful project execution.

What is a Collaboration Framework?

Whereas the life cycle defines the steps necessary to complete a project, the coordination framework defines how the team coordinates these various steps.

The collaboration framework within an effective machine learning process encourages Agile principles, specifically the concept of small, incremental deliverables with quick planning cycles, which helps the team support the ever-changing business landscape.

Three common agile collaboration frameworks that can work within the machine learning process:

  • Kanban: Simple, lightweight process centered on a highly visible board that describes the current flow of work
  • Scrum: A popular software development framework based on fixed-time product releases
  • Data Driven Scrum: A variant of Scrum designed specifically for data science with capability-based iterations

Wrap Up

To maximize the value of a data science team's effort in creating predictive models, the team should use an appropriate machine learning process. That process should include both the steps of the project and an agile coordination framework.
