Thoughts on MLOps in 2023

Mar 10, 2022

The development flow of a machine learning (ML)-based system is shown in the figure. At its core, an ML system consists of data, code, and a model. There are plenty of ML frameworks for training and serving these days, but there is still no standard way to build ML systems. The industry came up with MLOps, which applies DevOps-style practices to ML systems. However, different companies (such as Microsoft and Google) have their own versions of MLOps. Despite these differences, the main goal is the same: to make it easier to retrain and update ML models.

Components of Machine Learning-based Applications

Why do we need the loops?

Once an ML model is added to a software system, its performance can degrade over time and it may start to behave unexpectedly, so we need new data to retrain it. In addition, at several points during system development we may need to go back and redo earlier steps, such as collecting more or different data or relabeling the training data, and each of these should trigger retraining of the model. For example:
After examining the available data, we may find it difficult to obtain the data needed to solve the problem as originally defined, so we need to reformulate the problem.
After serving the model to end users, we may realize that the assumptions we made when training the model are wrong, so we have to change the model.
Sometimes the business objective changes, and we decide to change the machine learning algorithms we adopted.

In summary, three common issues influence the performance of ML models in the production environment:

1) Data quality: since ML models are built on data, they are sensitive to the semantics, amount, and completeness of incoming data.

2) Model decay: the performance of ML models in production degrades over time because the real-world data changes in ways that were not seen during model training.

3) Locality: when ML models are transferred to new business customers, models that were pre-trained on different user demographics may fail to meet the expected quality metrics.
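The first two issues can be monitored with very simple checks. Below is a minimal sketch, not a production monitor: `missing_rate` is a crude completeness check for data quality, and `population_stability_index` compares the distribution of a feature at training time with the distribution seen in production as a rough signal of model decay. Both function names, the bin count, and the sample data are illustrative assumptions, and the Population Stability Index is only one of several drift measures you could use here.

```python
import math
from typing import Optional, Sequence


def missing_rate(values: Sequence[Optional[float]]) -> float:
    """Fraction of entries that are None (a crude completeness check)."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def population_stability_index(
    expected: Sequence[float],
    actual: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """Compare two samples of one feature; 0.0 means identical histograms."""
    lo, hi = min(expected), max(expected)
    # Fall back to a width of 1.0 when all training values are equal.
    width = (hi - lo) / bins or 1.0

    def histogram(sample: Sequence[float]) -> list:
        counts = [0] * bins
        for x in sample:
            # Values outside the training range go into the edge bins.
            idx = min(max(int((x - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Clamp empty bins to eps so the logarithm below stays defined.
        return [max(c / len(sample), eps) for c in counts]

    e, a = histogram(expected), histogram(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical feature values: training data vs. shifted production data.
train = [i / 100 for i in range(100)]
production = [x + 0.5 for x in train]
print(missing_rate([1.0, None, 3.0, None]))
print(population_stability_index(train, production))
```

A commonly quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, but the right threshold depends on the feature and the application, so treat it as a starting point rather than a rule.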

MLOps Cycles

We have seen many circle diagrams on the internet. Various organizations have developed different types of MLOps, similar to DevOps. When I worked as an AI Engineer in Shenzhen, our small team followed similar MLOps principles. However, since startups are often short-staffed, we didn’t even realize we were practicing MLOps while juggling numerous projects.

In reality, many machine learning and data science projects are not well-organized, as things don’t always go as smoothly as depicted in the above figures. Google has classified MLOps into Level 0, Level 1, and Level 2. To be honest, most researchers and startups operate at Level 0 MLOps, which means they don’t have a CI/CD process. This is totally acceptable when models are rarely changed or retrained, even if they are deployed in production!

MLOps in Reality

In the real world, models may fail due to the issues mentioned earlier. Level 1 MLOps introduces an automated training pipeline that retrains the model and pushes it to production when new data comes in. This process is called Continuous Training (CT). Some tech companies might not develop as many tools as shown in the above diagrams, but they do create some automated tools so that data scientists or engineering teams can retrain models more efficiently. Level 2 MLOps builds on Level 1 by adding CI/CD pipelines that manage the source repositories and automatically build, test, and deploy the training pipeline and its models. With this CI/CD automation in place, models can be retrained and redeployed quickly and automatically.
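To make the idea of Continuous Training more concrete, here is a minimal sketch of a single CT step. Everything in it is an assumption for illustration: the `RetrainPolicy` thresholds, the `should_retrain` rule, and the `train`, `evaluate`, and `deploy` callbacks stand in for whatever your own pipeline provides. The sketch retrains when enough new data has arrived or when live accuracy drops, and deploys the candidate only if it beats the current model.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class RetrainPolicy:
    """Hypothetical thresholds; tune these for your own system."""
    min_new_samples: int = 1000
    min_accuracy: float = 0.90


def should_retrain(new_samples: int, live_accuracy: float,
                   policy: RetrainPolicy) -> bool:
    """Retrain when enough new data arrived or live accuracy dropped."""
    return (new_samples >= policy.min_new_samples
            or live_accuracy < policy.min_accuracy)


def continuous_training_step(
    new_samples: int,
    live_accuracy: float,
    policy: RetrainPolicy,
    train: Callable[[], Any],
    evaluate: Callable[[Any], float],
    deploy: Callable[[Any], None],
) -> str:
    """Run one CT step and report what happened."""
    if not should_retrain(new_samples, live_accuracy, policy):
        return "skipped"
    candidate = train()
    # Only replace the production model if the candidate is better.
    if evaluate(candidate) <= live_accuracy:
        return "rejected"
    deploy(candidate)
    return "deployed"


# Usage with placeholder callbacks (assumptions, not a real pipeline).
result = continuous_training_step(
    new_samples=2500,
    live_accuracy=0.87,
    policy=RetrainPolicy(),
    train=lambda: "candidate-model",
    evaluate=lambda model: 0.93,
    deploy=lambda model: None,
)
print(result)
```

At Level 1 a step like this would typically be run by a scheduler or a data-arrival trigger; at Level 2 the code for the step itself would also be built, tested, and released through CI/CD.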

Even though Google categorizes MLOps into Levels 0, 1, and 2, a lower level doesn’t necessarily mean a poor ML development process; each level suits different purposes. A higher-level MLOps framework requires more people and more time to build. In fact, many companies hire only one or two data scientists, who often also work as data engineers and backend engineers. A complex workflow might delay the project delivery!

Therefore, I recommend the following principles for choosing an MLOps cycle for your applications or products:

Choose MLOps Level 0 if you are building prototypes or systems that don’t need frequent updates. These systems typically operate in a pre-set and stable environment. For example, a machine vision system that detects whether people are wearing face masks when they are viewed directly from the front.
Choose MLOps Level 1 if your system works in a mostly stable environment that still changes from time to time. For example, a backend face detection system might be used to detect employees in footage from multiple CCTV cameras. Typically, you’ll need to update the face detection and person identification models several times as the system is affected by lighting and other interference.
Choose MLOps Level 2 if your system requires frequent updates. The recommendation systems used by TikTok or Amazon are prime examples, as they constantly adapt to capture users’ interests effectively. Time-series models in production should also be considered for MLOps Level 2.
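The three principles above can be written down as a small lookup. This is only a sketch of the rule of thumb in this post: the `UpdateCadence` categories and the `choose_mlops_level` function are hypothetical names, and deciding which cadence your own system falls into remains a judgment call.

```python
from enum import Enum


class UpdateCadence(Enum):
    """How often the deployed model is expected to need updating."""
    RARE = "rare"              # prototypes, pre-set and stable environments
    OCCASIONAL = "occasional"  # mostly stable, updated several times
    FREQUENT = "frequent"      # recommenders, time-series models


def choose_mlops_level(cadence: UpdateCadence) -> int:
    """Map the expected update cadence to the suggested MLOps level."""
    levels = {
        UpdateCadence.RARE: 0,
        UpdateCadence.OCCASIONAL: 1,
        UpdateCadence.FREQUENT: 2,
    }
    return levels[cadence]


print(choose_mlops_level(UpdateCadence.OCCASIONAL))
```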

Conclusion

Lower-level MLOps does not mean poor management of your system. Complexity does not mean advancement!
Higher-level MLOps is for scalable systems that need frequent updates.
The biggest challenge of MLOps is fast-changing business objectives, not the technology itself. Frameworks, tools, and platforms are always available.

References

[1] https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning#mlops_level_0_manual_process

[2] https://ml-ops.org/content/motivation

[3] https://docs.microsoft.com/en-us/azure/architecture/example-scenario/mlops/mlops-technical-paper