Model retraining is an integral component of a production MLOps pipeline. Modzy provides machine learning monitoring features, including drift detection and explainability, that give you full insight into your AI models' performance in a production setting. Together, the outcomes of these monitoring features can inform several different courses of action. One of them, retraining, enables your models to continuously improve over time and maintain performance levels appropriate for your use cases.

There are several tools required to build a repeatable, automated retraining pipeline:

  • Prediction and data drift detection*
  • Explainability*
  • Human-in-the-loop feedback (or labelling mechanism)*
  • Retraining dataset
  • CI/CD

* Modzy-provided features


Available Modzy Endpoints

The following endpoints are referenced in the retraining workflow below:

Retraining Workflow

Modzy focuses strictly on the operations side of the AI development lifecycle, so training or retraining models directly within the application is not supported. However, because the training side of the AI lifecycle is already crowded with tools, platforms, and frameworks, Modzy integrates directly into an organization's preferred training process, whatever form it takes. Once your models are up and running, you are monitoring their performance with drift detection and explainability, and you decide it is time to upgrade a model, Modzy gives you the tools to build a retraining pipeline that feeds back into your existing training process. Building this retraining pipeline can be described in three steps.

1. Query your inference results

First, call the job history endpoint to retrieve the jobs you want to use to generate a retraining dataset. This returns a list of jobs based on the query parameters you set (time frame, number of pages, jobs, etc.). Next, retrieve the results for each job in your job history list. Doing so allows you to store the raw model results from each of your inference jobs.
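The two calls in this step can be sketched as follows. This is a minimal sketch rather than Modzy's client API: `fetch_history` and `fetch_results` are hypothetical callables that you would write around the job history and results endpoints (for example with an HTTP client and your API key), and the `jobIdentifier` field name is an assumption about the shape of each job history entry.

```python
from typing import Callable, Dict, Iterable, List

# Jobs and results objects are treated as plain dictionaries in this sketch.
Job = Dict[str, object]
Results = Dict[str, object]


def collect_results(
    fetch_history: Callable[[], Iterable[Job]],
    fetch_results: Callable[[str], Results],
    id_key: str = "jobIdentifier",  # assumed field name in each history entry
) -> Dict[str, Results]:
    """Return a mapping of job identifier -> raw results object."""
    raw: Dict[str, Results] = {}
    for job in fetch_history():
        job_id = str(job[id_key])
        raw[job_id] = fetch_results(job_id)
    return raw


# Stand-ins for the real endpoint wrappers, for illustration only.
def fake_history() -> List[Job]:
    return [{"jobIdentifier": "job-1"}, {"jobIdentifier": "job-2"}]


def fake_results(job_id: str) -> Results:
    return {"jobIdentifier": job_id, "results": {}}


raw_results = collect_results(fake_history, fake_results)
print(sorted(raw_results))
```

Passing the two endpoint wrappers in as arguments keeps the collection logic independent of how you authenticate or paginate, so the same function works with a stub during testing and with real calls in production.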

2. Isolate the incorrect predictions

After you have queried your inference results, look for the voting key in your results object. It holds the human-in-the-loop feedback that users can submit either directly in the UI for explainable inferences or programmatically. The voting key-value pair includes the number of "up" and "down" votes the particular prediction received. As a model developer, this gives you insight into what users think about the model's performance for any given input.
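One way to isolate the incorrect predictions is sketched below. The layout of the results object is an assumption made for illustration: each job is taken to have a `results` mapping keyed by input name, and each entry a `voting` dictionary with `up` and `down` counts. The rule "more down votes than up votes" is also only one possible choice, so check both against your real results before relying on it.

```python
from typing import Dict, List, Tuple


def find_downvoted(raw_results: Dict[str, dict]) -> List[Tuple[str, str]]:
    """Return (job_id, input_name) pairs whose prediction was voted down."""
    flagged = []
    for job_id, job in raw_results.items():
        for input_name, item in job.get("results", {}).items():
            voting = item.get("voting") or {}
            # Assumed rule: more "down" than "up" votes marks a prediction as incorrect.
            if voting.get("down", 0) > voting.get("up", 0):
                flagged.append((job_id, input_name))
    return flagged


# Illustrative data only; not real Modzy output.
sample = {
    "job-1": {"results": {
        "images/cat_001.jpg": {"voting": {"up": 0, "down": 2}},
        "images/dog_007.jpg": {"voting": {"up": 3, "down": 0}},
    }},
    "job-2": {"results": {
        "images/cat_042.jpg": {"voting": {"up": 1, "down": 1}},
    }},
}

print(find_downvoted(sample))
```

Inputs with no votes, or with tied votes, are left out here; you may prefer to route those to a separate review queue instead.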

3. Map incorrect predictions to input data

Using the job input naming convention you choose (completely up to you), take the isolated incorrect predictions and the input key within the results object and map them back to the production data you passed through. Regardless of where the production data sits (locally on your laptop, in a cloud storage bucket, or in another database), you can identify the data lineage with smart input naming conventions and use this information to generate new data-label combinations. When you have enough to feed through your training loop, kick off training again and deploy your latest model version when it finishes.
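The mapping step might look like the sketch below. It assumes one particular naming convention, namely that each job input is named after the file's path relative to a storage root, and it assumes that corrected labels come from a reviewer as a simple dictionary. The function name, `data_root`, `corrected_labels`, and the bucket path are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def build_retraining_rows(
    flagged: Iterable[Tuple[str, str]],
    corrected_labels: Dict[str, str],
    data_root: str,
) -> List[Dict[str, str]]:
    """Turn flagged (job_id, input_name) pairs into data-label rows."""
    rows = []
    for job_id, input_name in flagged:
        label = corrected_labels.get(input_name)
        if label is None:
            continue  # no corrected label yet, so skip this input
        rows.append({
            "job_id": job_id,
            # Assumed convention: input name == path relative to data_root.
            "source": f"{data_root.rstrip('/')}/{input_name}",
            "label": label,
        })
    return rows


# Illustrative values only.
flagged = [("job-1", "images/cat_001.jpg"), ("job-2", "images/cat_042.jpg")]
corrected_labels = {"images/cat_001.jpg": "lynx"}

rows = build_retraining_rows(flagged, corrected_labels, "s3://my-bucket/prod/")
print(rows)
```

Keeping the job identifier in each row preserves the lineage from a retraining example back to the inference job that produced the incorrect prediction.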