How to Set Up Drift Detection

📘

The following how-to guide is specific to classification models. If your model is not a classification model, then this how-to guide will not apply. Check back soon for more details on evaluating drift for other types of models.

Format model outputs

Before you can start using Modzy to track and monitor model drift, you'll need to make sure your model outputs are compliant. Please see Prediction Drift Formats for more information on how to format your model's output to be compliant with Modzy's requirements.

Establish model baseline

Output drift detection compares model predictions against a baseline period of predictions that are considered "normal". Before you can begin tracking your model's output drift, you first need to run an initial set of inferences, or predictions, to use as the Baseline Period for your model. The baseline should cover a period of time when the data was regular or typical, so that all incoming data can be compared against it.
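To make the idea of comparing predictions against a baseline concrete, here is a minimal sketch. It is an illustration only: this guide does not describe the metric Modzy uses internally, so total variation distance is an assumed stand-in, and the function names are hypothetical.

```python
from collections import Counter


def class_distribution(labels):
    """Return the relative frequency of each predicted class label."""
    if not labels:
        raise ValueError("at least one prediction is required")
    counts = Counter(labels)
    return {label: n / len(labels) for label, n in counts.items()}


def total_variation_distance(baseline_labels, current_labels):
    """Illustrative drift score in [0, 1]; NOT necessarily Modzy's metric.

    0 means the two class distributions are identical, 1 means they
    share no classes at all.
    """
    p = class_distribution(baseline_labels)
    q = class_distribution(current_labels)
    classes = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in classes)


# Baseline is split evenly; the current window leans toward class "a".
baseline = ["a", "a", "b", "b"]
current = ["a", "a", "a", "b"]
drift_score = total_variation_distance(baseline, current)
```

A score near zero suggests that current predictions resemble the baseline; larger values suggest the class mix has shifted.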

🚧

Recommended minimum: 50 inferences before drift can be tracked

For best results, send at least 50 predictions through your model, with at least 5 predictions for each category being observed for drift.
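Before choosing a baseline period, it can help to check your own prediction log against this recommendation. The sketch below assumes you have already collected the top predicted class for each inference in a plain Python list; the function name and the sample labels are hypothetical.

```python
from collections import Counter

MIN_TOTAL = 50      # recommended minimum number of baseline predictions
MIN_PER_CLASS = 5   # recommended minimum per category observed for drift


def baseline_is_sufficient(top_classes, expected_classes,
                           min_total=MIN_TOTAL, min_per_class=MIN_PER_CLASS):
    """Return (ok, counts) for a list of top predicted class labels."""
    counts = Counter(top_classes)
    enough_total = len(top_classes) >= min_total
    # Counter returns 0 for classes that never appeared.
    enough_per_class = all(counts[c] >= min_per_class for c in expected_classes)
    return enough_total and enough_per_class, dict(counts)


# 50 predictions in total, but "bird" appears only 3 times.
labels = ["cat"] * 30 + ["dog"] * 17 + ["bird"] * 3
ok, counts = baseline_is_sufficient(labels, ["cat", "dog", "bird"])
```

In this example the total is met but one category falls short, so more inferences would be needed before setting the baseline.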

Configure drift settings

Select baseline period

After your model has run sufficient inferences, it should show up on the Drift page within the Operations section. Once you see your model, it should be marked "Needs drift configuration". Click on your model to set the baseline period.

The baseline period can be set using the calendar date picker or by directly typing a date into the beginning and ending boxes. The system will display all completed inferences during that period.

Baseline period date range selector

Set drift thresholds

Thresholds determine the point at which drift falls into the Nominal, Medium, or High range and triggers a warning. You will be notified of potential prediction drift on your home dashboard as well as on the primary drift page. Configure these values by sliding the two buttons on the bar, or by typing your values directly into the boxes.
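The way two thresholds split drift into three ranges can be sketched as follows. This is an assumption-laden illustration: the function name is hypothetical, and whether Modzy treats a value exactly equal to a threshold as the lower or the higher range is not stated in this guide (the sketch assigns it to the higher range).

```python
def classify_drift(score, medium_threshold, high_threshold):
    """Map a drift score to "Nominal", "Medium", or "High".

    Assumes scores at or above a threshold belong to the higher range.
    """
    if medium_threshold > high_threshold:
        raise ValueError("medium_threshold must not exceed high_threshold")
    if score >= high_threshold:
        return "High"
    if score >= medium_threshold:
        return "Medium"
    return "Nominal"


# Example thresholds chosen purely for illustration.
levels = [classify_drift(s, 0.3, 0.6) for s in (0.1, 0.45, 0.9)]
```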

Threshold selector

View drift results

Once you have set your baseline period and drift thresholds, you can return to the Results tab to view model drift over time. There may not be any data to display at first, but this chart will update automatically as your model produces more predictions.

Drift results page

Using Modzy's API

Alternatively, you can set the drift baseline more precisely using the Modzy API. For example, we can submit a set of 50 jobs from a DataFrame, then set exactly those 50 jobs as the baseline with a code snippet like this one:

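import datetime
import requests

# process_df, baseline_data, modzy_client, project_id, and admin_headers
# are assumed to be defined earlier in your own code.
model_id, model_version = 'ed542963de', '1.0.1'

# Submit the first 50 rows of the DataFrame as jobs.
baseline_jobs = process_df(baseline_data[:50], 'id', 'data', modzy_client, model_id=model_id, model_version=model_version)

# The baseline starts when the first job was submitted.
baselineStartDate = baseline_jobs[0].submitted_at
# Wait for the last job to finish, then estimate when it completed.
results = modzy_client.results.block_until_complete(baseline_jobs[-1], timeout=None)
submitted = datetime.datetime.strptime(baseline_jobs[-1].submitted_at, '%Y-%m-%dT%H:%M:%S.%f%z')
latency = datetime.timedelta(milliseconds=results['elapsedTime'] + results['initialQueueTime'] + results['totalModelLatency'] + results['totalQueueTime'])
# Truncate microseconds to milliseconds; this assumes the timestamps are in UTC.
baselineEndDate = (submitted + latency).strftime('%Y-%m-%dT%H:%M:%S.%f')[0:-3] + '+00:00'

data = {
    "baselineStartDate": baselineStartDate,
    "baselineEndDate": baselineEndDate
}

url = f"{modzy_client.base_url}/models/{model_id}/versions/{model_version}/projects/{project_id}/drift/baseline-period"
update_drift = requests.post(url, headers=admin_headers, json=data)
update_drift.raise_for_status()  # raise an error if the request was rejected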
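The end-date arithmetic in the snippet above can also be isolated into a small helper, which makes it easier to check on its own. This is a sketch under assumptions: the function names are hypothetical, and the millisecond-precision `+00:00` format is inferred from the snippet rather than from a documented API contract. Unlike the snippet, it converts to UTC explicitly instead of assuming the input is already in UTC.

```python
import datetime


def to_baseline_timestamp(moment):
    """Format a timezone-aware datetime as UTC with millisecond precision."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    utc = moment.astimezone(datetime.timezone.utc)
    # %f yields six digits (microseconds); drop the last three for milliseconds.
    return utc.strftime('%Y-%m-%dT%H:%M:%S.%f')[:-3] + '+00:00'


def baseline_end(submitted_at, total_latency_ms):
    """Estimate job completion time from its submission time plus latency."""
    submitted = datetime.datetime.strptime(submitted_at, '%Y-%m-%dT%H:%M:%S.%f%z')
    finished = submitted + datetime.timedelta(milliseconds=total_latency_ms)
    return to_baseline_timestamp(finished)


# A job submitted at 03:04:05.678 UTC that took 1500 ms in total.
end = baseline_end('2023-01-02T03:04:05.678000+00:00', 1500)
```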