This guide demonstrates the process of automatically containerizing your PMML model.

📘

What you will need

  • Dockerhub account
  • Connection to a running Chassis.ml service (either from a local deployment or via the publicly hosted service)
  • A trained PMML model that can be loaded into memory, or code to train a PMML model from scratch
  • Python environment

NOTE: To follow along, you can reference the Jupyter notebook example and data files here.

Set Up Environment

👍

We recommend you follow this guide using a Jupyter Notebook. Follow the appropriate install instructions based on your environment.

Create a Python virtual environment and install the Python packages required to load and run your model. At a minimum, pip install the following packages:

pip install chassisml modzy-sdk

If you would like to follow this guide directly, pip install the following additional packages:

scikit-learn>=1.0.2
pandas>=1.4.2
numpy>=1.22.3
sklearn-pmml-model
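
Before moving on, it can help to confirm what is actually installed in the active environment. The sketch below is not part of the Chassis tooling: it relies only on importlib.metadata from the standard library, the helper name installed_versions is made up for this guide, and sklearn-pmml-model is assumed to be the distribution behind the sklearn_pmml_model import used later.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Distribution names taken from the pip commands in this guide.
    wanted = [
        "chassisml",
        "modzy-sdk",
        "scikit-learn",
        "pandas",
        "numpy",
        "sklearn-pmml-model",  # assumed distribution name
    ]
    for name, ver in installed_versions(wanted).items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Any package reported as NOT INSTALLED can then be added with pip before you continue.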

Load Model into Memory

If you plan to use the Chassis service, you must first load your model into memory. If your trained model is saved locally (here a .pmml file; other frameworks use formats such as .pth, .pkl, .h5, or .joblib), you can load it directly from that file, or alternatively train the model and use the resulting model object.

import chassisml
import numpy as np
import json
from io import StringIO
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import pandas as pd
from sklearn_pmml_model.ensemble import PMMLForestClassifier

# Prepare data
iris = load_iris()
X = pd.DataFrame(iris.data)
X.columns = np.array(iris.feature_names)
y = pd.Series(np.array(iris.target_names)[iris.target])
y.name = "Class"
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.33, random_state=123)

# Create sample data for testing later (assumes the data/ directory already exists)
Xte[:10].to_csv("data/sample_iris.csv", index=False)

# Load model
clf = PMMLForestClassifier(pmml="models/randomForest.pmml")
labels = clf.classes_.tolist()

# Test model: print the predictions and the accuracy on the held-out split
print(clf.predict(Xte))
print(clf.score(Xte, yte))

def preprocess_inputs(raw_input_bytes):
    # load data
    inputs = pd.read_csv(StringIO(str(raw_input_bytes, "utf-8")))
    return inputs

def postprocess_outputs(raw_predictions):
    # rank the classes for each input row from highest to lowest probability
    inference_result = {
        "result": [
            {
                "row": i + 1,
                "classPredictions": [
                    {"class": labels[idx], "score": results[idx]}
                    for idx in np.argsort(results)[::-1]
                ],
            }
            for i, results in enumerate(raw_predictions)
        ]
    }

    # format output
    structured_output = {
        "data": {
            "result": inference_result["result"],
            "explanation": None,
            "drift": None,
        }
    }

    return structured_output
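
To make the shape of the returned payload concrete, here is a standalone sketch that mirrors the structure built by postprocess_outputs. It is an illustration only: the labels and probabilities are made up, plain Python lists stand in for the NumPy array returned by the model, and the names rank_classes and sketch_postprocess are hypothetical.

```python
import json

# Hypothetical class labels and probabilities standing in for clf.classes_
# and clf.predict_proba(...); the numbers are made up for illustration.
labels = ["setosa", "versicolor", "virginica"]
raw_predictions = [
    [0.10, 0.70, 0.20],
    [0.90, 0.07, 0.03],
]


def rank_classes(scores):
    """Return class indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda idx: scores[idx], reverse=True)


def sketch_postprocess(predictions):
    """Build the same nested structure as postprocess_outputs, using lists."""
    result = [
        {
            "row": i + 1,
            "classPredictions": [
                {"class": labels[idx], "score": scores[idx]}
                for idx in rank_classes(scores)
            ],
        }
        for i, scores in enumerate(predictions)
    ]
    return {"data": {"result": result, "explanation": None, "drift": None}}


payload = sketch_postprocess(raw_predictions)
print(json.dumps(payload, indent=2))
```

Each input row yields one entry whose classPredictions list is ordered from the most to the least likely class.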

Define process Function

You can think of this function as your "inference" function: it takes input data as raw bytes, processes the inputs, makes predictions, and returns the results. This function is the only argument required to create a ChassisModel object.

def process(input_bytes):
    # load data
    inputs = preprocess_inputs(input_bytes)
    # make predictions
    output = clf.predict_proba(inputs)
    # process output
    structured_output = postprocess_outputs(output)
    return structured_output
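
The bytes-in, dictionary-out contract can be exercised without the real model. The following is a rough sketch under several assumptions: StubClassifier is a hypothetical stand-in for clf, the standard-library csv module replaces pandas, and the final json.dumps call assumes the result will eventually be serialized to JSON, as its dictionary structure suggests.

```python
import csv
import io
import json


class StubClassifier:
    """Hypothetical stand-in for clf: returns fixed probabilities per row."""

    def predict_proba(self, rows):
        return [[0.2, 0.5, 0.3] for _ in rows]


stub = StubClassifier()


def sketch_process(input_bytes):
    # Decode the raw bytes and parse them as CSV with a header row.
    text = input_bytes.decode("utf-8")
    rows = list(csv.DictReader(io.StringIO(text)))
    # Ask the stub model for one probability vector per input row.
    probabilities = stub.predict_proba(rows)
    return {"data": {"result": probabilities, "explanation": None, "drift": None}}


# Two made-up rows in the same CSV-with-header layout as the sample file.
sample = b"sepal length (cm),sepal width (cm)\n5.1,3.5\n6.2,2.9\n"
output = sketch_process(sample)
# json.dumps raises TypeError if the output holds non-serializable values.
print(json.dumps(output))
```

Running the same kind of check against your real process function, with the bytes of your sample file, may surface decoding or serialization problems before you build a container.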

Create ChassisModel Object and Publish Model

First, connect to a running instance of the Chassis service, either by deploying it on your machine or by connecting to the publicly hosted version of the service. Then, use the process function you defined to create a ChassisModel object, run a few tests to ensure your model object returns the expected results, and finally publish your model.

chassis_client = chassisml.ChassisClient("http://localhost:5000")
chassis_model = chassis_client.create_model(process_fn=process)

Define a sample file from a local filepath and run a series of tests.

NOTE: The test_env method is not available on the publicly hosted service.

sample_filepath = './data/sample_iris.csv'
results = chassis_model.test(sample_filepath)
print(results)

test_env_result = chassis_model.test_env(sample_filepath)
print(test_env_result)

Define your Dockerhub credentials and publish your model.

dockerhub_user = "<my.username>"
dockerhub_pass = "<my.password>"
# Only needed if you also want Chassis to deploy the model to Modzy
modzy_url = "<modzy.base.url>"  # E.g., https://trial.app.modzy.com/api
modzy_api_key = "<my.modzy.api.key>"  # E.g., 8eba4z0AHqguxyf1gU6S.4AmeDQYIQZ724AQAGLJ8

response = chassis_model.publish(
   model_name="PMML Random Forest Iris Classification",
   model_version="0.0.1",
   registry_user=dockerhub_user,
   registry_pass=dockerhub_pass
)

job_id = response.get('job_id')
final_status = chassis_client.block_until_complete(job_id)
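
Hardcoding credentials in a notebook makes them easy to leak. One option, sketched here, is to read them from environment variables instead. The variable names DOCKERHUB_USER and DOCKERHUB_PASS and the helper read_credentials are hypothetical choices for this guide, not something Chassis requires.

```python
import os


def read_credentials(env=os.environ):
    """Collect registry credentials from environment variables.

    DOCKERHUB_USER and DOCKERHUB_PASS are hypothetical variable names.
    """
    required = ("DOCKERHUB_USER", "DOCKERHUB_PASS")
    # Treat unset and empty variables the same way.
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return env["DOCKERHUB_USER"], env["DOCKERHUB_PASS"]
```

With both variables exported in your shell, dockerhub_user, dockerhub_pass = read_credentials() supplies the values passed to registry_user and registry_pass.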

You have successfully completed the packaging of your PMML model. In your Dockerhub account, you should see your new container listed in the "Repositories" tab.

Figure 1. Example Chassis-built Container

Congratulations! In just minutes, you automatically created a Docker container with only a few lines of code. To deploy your new model container to Modzy, follow the Import Container guide.

Complete Documentation and Begin Running Model

🚧

Note

If you include your Modzy credentials in the chassis_model.publish call (i.e., modzy_url, modzy_api_key, and modzy_sample_input_path), the Chassis service fully deploys your model for you. If you chose this option, follow the remainder of this guide to finalize your deployment. If you instead only auto-containerized your model and pushed the container to a Docker registry, and want to import the model manually, follow these instructions to finalize the deployment.

Click "Edit Model" to add documentation and tags to your newly deployed model. This will ensure other team members within your organization can discover this model and decide whether or not it fits their use case.

Figure 1. Edit Model

Edit the "Add documentation" section to add the following documentation to your model:

  • Description: A summary of your model in a few sentences. Include the task your model accomplishes and brief information about expected inputs and outputs.
  • Performance Overview: An overview, in a few sentences, of how your model was evaluated and any relevant performance metrics captured during the training and validation processes.
  • Performance Metrics: Add metrics you computed during training that will be displayed on your model home page.
  • Transparency and bias reporting: Technical details for your model, such as information about your model’s design, architecture, training data, development approach, etc.

Figure 2. Add documentation

When you have finished, move on to the "Assign tags and categories" section. Adding tags will make your model more accessible and discoverable within your organization's Modzy model library.

Figure 3. Assign tags

Congratulations! In just minutes, you have successfully packaged, deployed, and documented your machine learning model. Check out our Running Inferences guides to start using your model right away.
