1. Package Model


Self-Service Tutorial Contents

  1. Package Model (this tutorial)
  2. Deploy Model
  3. Scale Model Up
  4. Run Model Inference
  5. Set Drift Baseline
  6. Deploy Model to Edge Device

Prepare your environment

In this first tutorial of our end-to-end Modzy tutorial series, we will containerize a pre-trained model using Chassis, an open-source tool for packaging models into containers.


What you'll need for this tutorial

  • A Python environment (Python >= 3.6)
  • A free Dockerhub account
  • We recommend following this tutorial in a Jupyter notebook, but any IDE will work

Once Python is installed, create a virtual environment (venv, conda, or your virtual environment tool of choice) and install Jupyter Notebook using the appropriate install instructions.

Next, use pip to install the following packages:

pip install chassisml torch transformers[torch] numpy

With your environment set up, launch Jupyter Notebook from your terminal:

jupyter notebook

The remainder of this tutorial will be executed within this notebook.
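
Before going further, it can help to confirm that the packages installed above are visible to the notebook's Python environment. The following is a minimal sketch, not part of the original tutorial: the helper name installed_versions is hypothetical, and it relies on importlib.metadata, which assumes Python 3.8 or newer.

```python
from importlib import metadata

# Distribution names as passed to pip earlier in this tutorial.
REQUIRED = ["chassisml", "torch", "transformers", "numpy"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Print a short report; a missing package shows up as "NOT INSTALLED".
for package, version in installed_versions(REQUIRED).items():
    print("{}: {}".format(package, version or "NOT INSTALLED"))
```

If any package is reported as missing, rerun the pip command from inside the same virtual environment that Jupyter is using.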

Download model from Hugging Face

In this tutorial, we will take advantage of the Hugging Face model library and package a TinyBERT text classification model.

To start, download the model and save it to your machine by adding this code snippet to your notebook:

# import packages
import os
import time
import torch
import chassisml
import numpy as np
from transformers import BertTokenizer, BertForSequenceClassification

# download TinyBERT model and tokenizer
tokenizer = BertTokenizer.from_pretrained("gokuls/BERT-tiny-emotion-intent")
model = BertForSequenceClassification.from_pretrained("gokuls/BERT-tiny-emotion-intent")

# save model locally so we can use/access it with Chassisml package
tokenizer.save_pretrained("./tiny-bert-model")
model.save_pretrained("./tiny-bert-model")

# create sample text input and save it as a text file
with open("input.txt", "w") as text_file:
    text_file.write("This is my first time using Modzy!")

Prepare model for Chassis

Now that our pre-trained model is downloaded from Hugging Face, we will prepare it so that Chassis can package it.

Copy the below code snippet into your notebook to load the model into memory, define labels, and create an inference function we will call process.

# create labels to use in process function
labels = model.config.id2label
mapped_labels = {"LABEL_0": 'sadness',"LABEL_1": 'joy',"LABEL_2": 'love',"LABEL_3": 'anger',"LABEL_4": 'fear',"LABEL_5": 'surprise'}

# load model to memory
tinybert_tokenizer = BertTokenizer.from_pretrained("./tiny-bert-model")
tinybert_model = BertForSequenceClassification.from_pretrained("./tiny-bert-model")

# define process function that will serve as our inference function
def process(input_bytes):
    # decode and preprocess data bytes
    text = input_bytes.decode()
    inputs = tinybert_tokenizer(text, return_tensors="pt")
    # run preprocessed data through model
    with torch.no_grad():
        logits = tinybert_model(**inputs).logits
        softmax = torch.nn.functional.softmax(logits, dim=1).detach().cpu().numpy()
    # postprocess 
    indices = np.argsort(softmax)[0][::-1]
    results = {
        "data": {
            "result": {
                "classPredictions": [{"class": mapped_labels[labels[i]], "score": softmax[0][i]} for i in indices]
            }
        }
    }
    return results
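
The postprocessing step in process can be tried in isolation, without downloading the model. The sketch below reproduces the softmax and argsort logic with NumPy only. The function name rank_classes and the logit values are made up for illustration; real logits come from the TinyBERT model.

```python
import numpy as np


def rank_classes(logits, class_names):
    """Softmax a (1, n_classes) logit array and return classes sorted by score, highest first."""
    # Subtract the row maximum before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    probs = exp / exp.sum(axis=1, keepdims=True)
    # argsort is ascending, so reverse the first (only) row to get descending order.
    indices = np.argsort(probs)[0][::-1]
    return [{"class": class_names[i], "score": float(probs[0][i])} for i in indices]


# Illustrative logits only, one value per emotion class.
example_logits = np.array([[0.1, 2.0, 0.5, -1.0, -0.3, 0.0]])
emotions = ["sadness", "joy", "love", "anger", "fear", "surprise"]

ranked = rank_classes(example_logits, emotions)
for prediction in ranked:
    print(prediction)
```

The class with the largest logit ("joy" in this made-up input) appears first, and the scores sum to 1, which is the same ordering the classPredictions list follows.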

Connect to Chassis and test model

Next, connect to the publicly-hosted Chassis service, create a ChassisModel object, and test your model with the below code snippet:

# initialize Chassis client
chassis_client = chassisml.ChassisClient("https://chassis.app.modzy.com")

# create Chassis model
chassis_model = chassis_client.create_model(process_fn=process)

# test Chassis model locally (can pass filepath, bufferedreader, bytes, or text here):
sample_filepath = './input.txt'
results = chassis_model.test(sample_filepath)

If successful, you should see the model's class predictions for the sample text in your notebook.
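
To check the structure of a prediction programmatically, a small helper along these lines could be used. This is a sketch under assumptions: check_predictions is a hypothetical name, it expects the dictionary returned by the process function defined earlier, and the example values are illustrative rather than real model output.

```python
def check_predictions(results, tolerance=1e-3):
    """Return the class predictions if `results` has the structure built in process()."""
    predictions = results["data"]["result"]["classPredictions"]
    scores = [float(p["score"]) for p in predictions]
    if scores != sorted(scores, reverse=True):
        raise ValueError("predictions are not sorted by descending score")
    if abs(sum(scores) - 1.0) > tolerance:
        raise ValueError("scores do not sum to 1")
    return predictions


# Illustrative values only, not real model output.
example = {
    "data": {
        "result": {
            "classPredictions": [
                {"class": "joy", "score": 0.7},
                {"class": "love", "score": 0.2},
                {"class": "surprise", "score": 0.1},
            ]
        }
    }
}

top = check_predictions(example)[0]
print(top["class"], top["score"])
```

In the notebook, the same check could be applied to the return value of process, for example check_predictions(process(open("input.txt", "rb").read())).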


Publish model to Dockerhub

Finally, publish your model using your Dockerhub credentials. Replace the DOCKER_USER and DOCKER_PASS values with your own credentials (lines 2-3) and run the code below in your notebook.

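# create variables for publish method
DOCKER_USER = "<insert-dockerhub-username>"
DOCKER_PASS = "<insert-dockerhub-password>"
# placeholder values (not specified in this tutorial): choose your own model name and version
MODEL_NAME = "<insert-model-name>"
MODEL_VERSION = "0.0.1"

# publish model to Dockerhub
start_time = time.time()
response = chassis_model.publish(
    model_name=MODEL_NAME,
    model_version=MODEL_VERSION,
    registry_user=DOCKER_USER,
    registry_pass=DOCKER_PASS,
)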
# wait for job to complete and print result
job_id = response.get('job_id')
final_status = chassis_client.block_until_complete(job_id)
end_time = time.time()
if final_status['status']['succeeded'] == 1:
    print("Job Completed in {} minutes. View your new container image here: https://hub.docker.com/repository/docker/{}/{}".format((end_time - start_time)/60, DOCKER_USER, "-".join(MODEL_NAME.lower().split(" "))))
else:
    print("Job Failed in {} minutes. See logs below:\n\n{}".format((end_time - start_time)/60, final_status['logs']))
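
The success message builds the Dockerhub link by lowercasing the model name and replacing spaces with hyphens. The sketch below isolates that step so the resulting link can be previewed before publishing. The function name image_repo_url and the example user and model name are hypothetical, and whether the published repository name matches this slug exactly is an assumption based on the print statement above.

```python
def image_repo_url(docker_user, model_name):
    """Build the Dockerhub repository link the same way the success message does."""
    # Lowercase the name and join its space-separated words with hyphens.
    slug = "-".join(model_name.lower().split(" "))
    return "https://hub.docker.com/repository/docker/{}/{}".format(docker_user, slug)


# Hypothetical user and model name, for illustration only.
url = image_repo_url("example-user", "Example Emotion Classifier")
print(url)
```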

After a few minutes, your container will be built and pushed to your Dockerhub account. In the next tutorial, learn how to deploy this to your Modzy account!
