Drift is an umbrella term for a change in model behavior relative to some baseline – that baseline typically being determined by the dataset used for the model's training. As time passes, a model may be asked to run inference on data that differs from this dataset, making its predictions less accurate. These changes typically fall into one of three areas: the outputs drifting, the inputs shifting, or the model's real-world operating environment changing, which we refer to as prediction drift, data drift, and concept drift respectively.
When drift in any one of these areas becomes too great, a new version of the model, trained on an updated or corrected dataset, should be deployed. By monitoring the distributions of input and output data and comparing them to a baseline, you can quickly detect increasing drift and degrading performance, identify why a model is making incorrect predictions, and correct these issues.
To find the drift page, navigate to the "Operations" top header and click the "Drift" section. Here you will find all of the models your team has identified for review. Click into any of them to set its drift configuration and view details.
Modzy uses a standard chi-squared goodness-of-fit test, χ² = Σ (O − E)² / E, to derive our Prediction Drift Error. More detail about chi-squared distributions and their use in quantifying the normalcy of model results is available here. For hypothesis testing, we use lookup tables for models with fewer than 50 recognized classes (degrees of freedom), and a normal approximation for models with more than 50 degrees of freedom.
Modzy's Prediction Drift Error is defined as
predictionDriftError = 1 - pValue. The p-value is the probability of observing a test statistic at least as extreme as the one computed, under the chi-squared distribution – put another way, extreme values of the chi-squared statistic have low probability and therefore small p-values. Modzy's Prediction Drift Error is therefore close to 0 when inferences match the baseline distribution, and approaches 1 when inference results diverge from the set baseline.
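The calculation can be sketched as follows. This is an illustrative example using SciPy's standard chi-squared test, not Modzy's actual implementation; the class counts and baseline proportions below are made up for demonstration.

```python
# Sketch of a prediction drift error computed from a chi-squared
# goodness-of-fit test (illustrative; not Modzy's implementation).
import numpy as np
from scipy.stats import chisquare

def prediction_drift_error(observed_counts, baseline_proportions):
    """Return 1 - pValue for a chi-squared goodness-of-fit test.

    observed_counts: per-class prediction counts from recent inferences.
    baseline_proportions: per-class proportions from the baseline dataset.
    """
    observed = np.asarray(observed_counts, dtype=float)
    # Expected counts scale the baseline proportions to the observed total.
    expected = np.asarray(baseline_proportions, dtype=float) * observed.sum()
    statistic, p_value = chisquare(f_obs=observed, f_exp=expected)
    return 1.0 - p_value

# Predictions matching the baseline distribution -> drift error near 0.
print(prediction_drift_error([25, 25, 25, 25], [0.25, 0.25, 0.25, 0.25]))

# Predictions diverging from the baseline -> drift error near 1.
print(prediction_drift_error([70, 10, 10, 10], [0.25, 0.25, 0.25, 0.25]))
```

Note that the expected counts are derived by scaling the baseline class proportions to the size of the observed sample, so the two frequency vectors sum to the same total, as the test requires.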
Data drift occurs when the input data to a model diverges significantly from the data the model was trained on.
For example, a computer vision model can be trained to detect and classify traffic signs in the United States, perhaps under ideal environmental and weather conditions. When the model is used in the real world, it may receive images of traffic signs that are worn or damaged, taken in rainy conditions, or from a different country where the designs, symbols, or markings differ, resulting in misclassifications. The ability to detect this drift in the incoming data stream, and to alert a user or developer so that potential operational risks or hazards can be avoided, is essential.
Modzy's input drift detection system first submits a sample of the training data to the model and observes what happens inside the model itself – in the 'neurons', or extraction layers – to generate a mathematical characterization of the state of each of those 'neurons'. In production, live data is processed by the model and Modzy's system observes the same internal layers. If the same 'neurons' light up, the data is familiar to the model and you can be confident that the production data is similar enough to the training data to produce a trustworthy result. If, instead, a significant difference in the model's internal state is observed, the resulting inference should be questioned: the data doesn't match what the model was trained on and the result could be wrong. We compute a data drift score by calculating a mathematical transform, or "distance", between these model states – if the distance is great, the production data is different from the training data.
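The idea above can be sketched with a toy example. This is not Modzy's actual algorithm; it assumes a simple per-neuron mean/standard-deviation characterization of a chosen extraction layer, and uses a standardized distance as the drift score. The layer size, sample sizes, and synthetic activations are all made up for illustration.

```python
# Toy sketch of input drift scoring from internal-layer activations
# (illustrative; not Modzy's actual method).
import numpy as np

def characterize(activations):
    """Summarize each neuron over a baseline batch: (mean, std) per neuron."""
    acts = np.asarray(activations, dtype=float)
    return acts.mean(axis=0), acts.std(axis=0) + 1e-8  # avoid divide-by-zero

def drift_score(baseline_stats, production_activations):
    """Mean standardized distance between production activations and baseline."""
    mean, std = baseline_stats
    acts = np.asarray(production_activations, dtype=float)
    z = np.abs(acts - mean) / std  # per-neuron deviation from baseline state
    return float(z.mean())         # small -> familiar data, large -> drifted

rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, size=(1000, 64))  # baseline-sample activations
stats = characterize(baseline)

familiar = rng.normal(0.0, 1.0, size=(100, 64))   # in-distribution inputs
shifted = rng.normal(3.0, 1.0, size=(100, 64))    # out-of-distribution inputs
print(drift_score(stats, familiar))  # low score: 'neurons' light up as expected
print(drift_score(stats, shifted))   # high score: internal state has diverged
```

In practice the characterization would come from hooks on real model layers, and the distance metric could be more sophisticated, but the principle is the same: familiar inputs produce internal states close to the baseline, drifted inputs do not.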
Concept drift is a shift in reality – in the real-world relationship between a model's inputs and outputs. For example, bees are legally fish in California – this is a brand-new concept, one that changes every future classification. It is not a shift in the model or the data; it is a shift in the model's real-world operating environment, changing the concepts or definitions of "bee" and "fish".