Best Practices for MLOps Documentation
Whether it’s a side project or a new feature in a larger product, technical documentation is vital to every project and saves time.
Throughout history, technical documentation has served as a medium for passing on information or as a collection of instructions for using specific tools. The oldest recorded example in the Western world, the Rhind Papyrus (ca. 1650 B.C.), contained material on ancient Egyptian mathematics. To appreciate the significance of the subject, consider an imaginary scenario: you purchase furniture parts from Ikea, the package comes with no manual, and you cannot find any material online either. Think for a second how challenging it would be to assemble that furniture; this scenario should help keep things in context.
General Outline
- Why does technical documentation matter in MLOps?
- How to implement technical documentation in MLOps
- Final thoughts
Why does technical documentation matter in MLOps?
Figure: The AI lifecycle for IT production
Communication is a recurring theme in all of the above practices. As indicated in the image, operations generally include iterating or returning to prior stages, a common element throughout the lifecycle. This is where technical documentation comes in; these processes must be accurately recorded to ensure repeatability.
MLOps levels are defined by the extent of automation of processes in an ML pipeline, which, according to Google, is divided into three: MLOps levels 0, 1, and 2.
MLOps level 0: Manual process
This degree of automation is most common in firms or teams just beginning to incorporate machine learning into their products or to deploy models to new applications. Since models are seldom modified or retrained, the technical documentation may not need to change constantly. The graphic below depicts the process flow.
Figure: Manual ML steps to serve the model as a prediction service.
Characteristics of MLOps level 0
- Manual, script-driven, and interactive process: In this level of automation, many steps in the diagram above, like data analysis, data preparation, model training, and validation, are all carried out manually.
- Infrequent release iterations: It is presumed that the data science team runs only a handful of models during the year and doesn’t release new model versions frequently.
- No CI: Since the tests and script execution already live in the notebook, continuous integration is neglected, as few implementation changes are assumed to happen.
- No CD: There are few model version deployments; the model is often deployed only once.
- Deployment refers to the prediction service: The end product is the model deployed as a prediction service, i.e., a microservice with a REST API.
- Lack of active performance monitoring: Model predictions and actions are not tracked or logged throughout the process.
Security, regression, load, and canary testing may be part of the engineering team’s extensive API configuration, testing, and deployment setup.
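Since the level 0 end product is a model served behind a REST API, a minimal sketch using only Python’s standard library may help make that concrete. The linear scorer, feature names, and `/predict` route below are illustrative assumptions, not any particular framework’s API:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical "model": a hand-rolled linear scorer standing in for a
# real trained artifact that would normally be deserialized from disk.
WEIGHTS = {"age": 0.3, "income": 0.0001}
BIAS = -0.5

def predict(features):
    """Score one example; unknown features contribute nothing."""
    return BIAS + sum(WEIGHTS.get(name, 0.0) * value
                      for name, value in features.items())

class PredictionHandler(BaseHTTPRequestHandler):
    """Expose the model as a tiny JSON-over-HTTP prediction service."""

    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        features = json.loads(self.rfile.read(length))
        body = json.dumps({"score": predict(features)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

# To serve: HTTPServer(("", 8080), PredictionHandler).serve_forever()
```

In a real level 0 setup the `predict` function would load a serialized model handed off by the data science team, which is exactly the interface boundary the documentation practices below should record.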
Best technical documentation practices for MLOps level 0
- Document all hyperparameters chosen for the model, why they were chosen, and their level of importance to the model.
- Monitor and document significant shifts in the quality of the model in production to help detect the reasons for performance degradation; this gives the cue for when to retrain your model with the most recent data.
- Invest time in creating a template to document the entire process, from data extraction to serving the model to handing it off to the production team.
- Document decisions made in the process, for example, how and where to get data or the labelling methods used for each project.
A handy piece of software for documenting this level of automation is Notion. An excellent article on data science documentation in Notion can be found here.
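As a sketch of the template idea above, the run record below captures hyperparameters, their rationale, and handoff notes, then renders them as a small Markdown page for a team wiki. Every field name and value is an illustrative assumption:

```python
import json
from datetime import date

# Hypothetical record of one training run; all fields are illustrative.
run_doc = {
    "date": date.today().isoformat(),
    "data_source": "warehouse export, sales table",
    "hyperparameters": {
        "learning_rate": {"value": 0.01, "why": "best of grid search",
                          "importance": "high"},
        "max_depth": {"value": 6, "why": "deeper trees overfit",
                      "importance": "medium"},
    },
    "handoff_notes": "Serialized model delivered to production as model_v1.pkl",
}

def render_markdown(doc):
    """Render the run record as a short Markdown document."""
    lines = [f"# Training run, {doc['date']}", "",
             f"Data source: {doc['data_source']}", "",
             "## Hyperparameters", ""]
    for name, info in doc["hyperparameters"].items():
        lines.append(f"- **{name}** = {info['value']} "
                     f"({info['importance']} importance): {info['why']}")
    lines += ["", doc["handoff_notes"]]
    return "\n".join(lines)

# Machine-readable copy for tooling, human-readable copy for the wiki.
json_copy = json.dumps(run_doc, indent=2)
markdown_copy = render_markdown(run_doc)
```

Keeping both a JSON and a Markdown copy means the same record can feed automated tooling and a human-facing page like a Notion document.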
MLOps level 1: ML pipeline automation
Level 1’s goal is to automate the ML process and deliver continuous model prediction services. This ensures that model behavior and concept drifts can be detected and prevented early. Pipeline triggers, metadata management, automated data, and model validation steps are essential to automate the pipeline.
Figure: ML pipeline automation for CT.
Characteristics of MLOps level 1
- Rapid experimentation: The steps of the machine learning experiment are pre-planned. The transitions between processes are automated, allowing for quicker experimentations and increased readiness to take the whole pipeline to production.
- CT of the model in production: The model is trained automatically in production on new data based on live pipeline triggers; the machine learning production pipeline can retrain the model with new data on demand or on a schedule, depending on the use case.
- Modularized code for components and pipelines: Components and processes are made composable, reusable, and perhaps shareable among different steps.
- Continuous model delivery: The production machine learning pipeline continuously delivers prediction services backed by models newly trained on fresh data.
- Pipeline deployment: At level 0, you deploy a trained model to production as a prediction service. For level 1, you deploy a whole training pipeline that runs automatically and on a recurring basis to act as the prediction service for the trained model.
- Feature store (optional): A feature store is a centralized repository for defining, storing, and accessing features used in training and serving; it is an optional supplementary component for level 1 ML pipeline automation.
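The on-demand or scheduled retraining described above boils down to a trigger check the pipeline runs before kicking off continuous training. A minimal sketch, in which the drift threshold and the seven-day cadence are illustrative assumptions:

```python
from datetime import datetime, timedelta

DRIFT_THRESHOLD = 0.15              # illustrative cutoff for a drift score
RETRAIN_EVERY = timedelta(days=7)   # illustrative retraining schedule

def should_retrain(drift_score, last_trained, now=None):
    """Return True if the training pipeline should be re-run, either
    because measured data drift exceeds the threshold (on-demand
    trigger) or because the schedule has elapsed (time trigger)."""
    now = now or datetime.now()
    if drift_score > DRIFT_THRESHOLD:
        return True
    return now - last_trained >= RETRAIN_EVERY
```

In practice the drift score would come from a monitoring job comparing live feature distributions against the training set, and the decision itself is exactly the kind of event worth logging in the pipeline’s documentation.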
Best technical documentation practices for MLOps level 1
Data versioning: With DVC, you can check changes between different versions of data at any time during the process and get the full context of any experiment you or your colleagues have previously run. According to the website:
“DVC guarantees that all files and metrics will be consistent and in the right place to reproduce the experiment or use it as a baseline for a new iteration.”
DVC uses Git to monitor and version control datasets and models. Similar to how GitHub tracks software engineering changes, DVC lets us monitor data changes in a project. Click here for a hands-on data versioning lesson.
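A minimal DVC session illustrating this Git-backed data versioning might look as follows; the dataset path and commit message are placeholders, and the commands assume an existing Git repository with a DVC remote configured:

```shell
# Initialize DVC alongside Git in the project repository
dvc init

# Track a dataset; DVC writes a small .dvc pointer file for Git to version
dvc add data/raw/train.csv

# Commit the pointer file, not the data itself, to Git
git add data/raw/train.csv.dvc data/raw/.gitignore
git commit -m "Track training data with DVC"

# Push the actual data to the remote storage configured via `dvc remote add`
dvc push
```

Checking out an older Git commit and running `dvc checkout` then restores the matching data version, which is what makes past experiments reproducible.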
Model monitoring: The company should invest in model monitoring tools to identify when the model makes incorrect predictions and how best to improve it. The Weights & Biases (wandb) library keeps track of an experiment’s hyperparameters and metrics. Check out this Neptune blog post for additional information about machine learning model monitoring solutions like Optuna and MLflow.
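Whatever tool is chosen, the core of production monitoring is comparing recent prediction quality against a validation-time baseline. A minimal standard-library sketch of that idea, in which the baseline, window size, and tolerance are illustrative assumptions rather than any monitoring tool’s API:

```python
from collections import deque

class AccuracyMonitor:
    """Track rolling accuracy of logged predictions and flag degradation."""

    def __init__(self, baseline=0.90, window=100, tolerance=0.05):
        self.baseline = baseline        # accuracy observed at validation time
        self.tolerance = tolerance      # acceptable drop before alerting
        self.outcomes = deque(maxlen=window)

    def log(self, prediction, actual):
        """Record one prediction once ground truth becomes available."""
        self.outcomes.append(prediction == actual)

    def rolling_accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def degraded(self):
        """True when rolling accuracy drops below baseline minus tolerance."""
        acc = self.rolling_accuracy()
        return acc is not None and acc < self.baseline - self.tolerance
```

A `degraded()` result of True is the cue, noted in the level 0 practices above, to document the shift and retrain on fresh data.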
Template for project structure: Before embarking on a project, a design should already be laid out to follow, even if variations from the template occur. Because the code in the components and pipeline is modularized, setting up a standard template such as those recommended below can be beneficial.
MLOps level 2: CI/CD pipeline automation
MLOps level 2 employs a robust automated CI/CD system that ensures the pipelines in production are updated in a timely and reliable manner. This maturity level lets data scientists rapidly explore new ideas in feature engineering, model architecture, and hyperparameters.
Figure: Stages of the CI/CD automated ML pipeline
Characteristics of MLOps level 2
- Development and experimentation: This is the stage of coordinated experiments, where new models are tested on the data to determine whether they improve the relevant metrics. The source code for the machine learning stages is then pushed to the project repository upon completion of this step.
- Pipeline continuous integration: In this step, various tests are run, and more code is added to produce artifacts, packages, and executables to be utilized in the next step of the automated CI/CD pipeline.
- Pipeline continuous delivery: You deploy the artifacts produced in the earlier continuous integration stage to the project target environment.
- Automated triggering: The pipeline is automatically run in production based on previously set triggers or an earlier set schedule.
- Model continuous delivery: The outcome of this step is a fully deployed model as a prediction service.
- Monitoring: Based on real-time data, reports and statistics are compiled on the model’s performance.
To summarise, this level of automation includes more than just delivering your model as a prediction API. Instead, it involves building a machine learning pipeline that can automate model retraining and deployment. Using CI/CD enables you to test and deploy new pipeline implementations automatically.
Best technical documentation practices for MLOps level 2
Testing
The tests at this automation maturity level should cover three fronts: the data, the ML model, and the application. We therefore distinguish features-and-data tests, model development tests, and ML infrastructure tests. The following tests should be run and their results documented:
- Features and data tests
- Model development reliability tests
- ML infrastructure tests
Final thoughts
As organisations begin to integrate MLOps practices into their processes, technical documentation will become increasingly key to implementing those practices successfully. The documentation practices in this article can be applied across all automation maturity levels and built upon as each situation or circumstance dictates.
What is your organization doing to bring technical documentation into the MLOps lifecycle? Please share your experience in the comments.
References