Packaging Your First Model


Not available for Modzy Basic Accounts

This feature is not available for Modzy Basic accounts. Contact sales if you're interested in trying a fully featured version of Modzy.

In this guide, we will prepare a machine learning model for deployment to Modzy using our Open Source Container specification. If your model is written in Python, use one of our two templates (using different protocols) to make the packaging process easier. View these templates on our GitHub page:

Each template contains skeleton code that will help you get started. For more information on the raw specifications for each protocol, visit the model packaging section of our API reference page.

We will complete the containerization process in three steps:

  1. Construct Model Container
  2. Construct Metadata Configuration File
  3. Test and Validate Model Container

:construction: Construct Model Container

Edit the skeleton template scripts and the Dockerfile to build a Modzy-compatible Docker image that contains all model code and dependencies.

gRPC Template

Migrate your existing Model Library or Develop a Model Library from Scratch

Use model_lib/src to store your model library and model_lib/tests to store its associated test suite. You can import your existing model library into this repository with any structure; at a minimum, however, it must expose functionality to instantiate your model and perform inference with it. We recommend that developers also include and document the complete training code and the model architecture within the model library to ensure full reproducibility and traceability.

Integrate your Model into the Modzy Model Wrapper Class

Navigate to the model_lib/src/ file within the repository, which contains the Modzy Model Wrapper Class. Fill out the __init__() and handle_discrete_input() methods by following the instructions provided in the comments for this module.
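As a rough illustration of what a filled-in wrapper might look like, the sketch below uses a toy keyword counter in place of a real model. The method signature (a dictionary of file names to bytes, in and out) and the file names input.txt and results.json are assumptions made for this example; follow the comments in the template for the actual contract.

```python
import json
from typing import Dict


class ExampleModel:
    """Hypothetical wrapper sketch; the template's real signatures may differ."""

    def __init__(self) -> None:
        # One-time setup goes here (loading weights, label maps, and so on).
        # A toy keyword set stands in for a trained model.
        self.positive_words = {"good", "great", "excellent"}

    def handle_discrete_input(self, model_input: Dict[str, bytes]) -> Dict[str, bytes]:
        # Assumed contract: file name -> raw bytes in, file name -> raw bytes out.
        text = model_input["input.txt"].decode("utf-8")
        tokens = text.lower().split()
        score = sum(token in self.positive_words for token in tokens)
        label = "positive" if score > 0 else "neutral"
        result = {"label": label, "score": score}
        return {"results.json": json.dumps(result).encode("utf-8")}
```

Keeping all loading work in __init__() means each inference call only pays for the prediction itself.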

Host gRPC server inside a Docker Container

Set up the Dockerfile correctly to ensure your gRPC model server can be spun up inside your Docker container.


  • Complete the handle_discrete_input_batch() method in order to enable custom batch processing for your model.
  • Refactor the ExampleModel class name in order to give your model a custom name.

HTTP Template

Migrate your existing Model Library

Update model_lib/ with your model code while maintaining the template base class structure. Include all model instantiation, environment variable definitions, and other one-time loading processes in the constructor __init__() method of the model class ModelName(ModelBase). Include any inference-specific code in the run(self, input_path, output_path) method, where your model should process the input data passed in through the input_path parameter, make predictions, and write the model output to the location given by the output_path parameter.
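The structure described above could be sketched roughly as follows. The ModelBase class here is a stand-in defined only so the example runs on its own (the template ships its own base class), the hex-to-RGB "model" is a toy, and treating input_path and output_path as file paths is an assumption.

```python
import json
from abc import ABC, abstractmethod


class ModelBase(ABC):
    """Stand-in for the template's base class (assumption, for self-containment)."""

    @abstractmethod
    def run(self, input_path, output_path):
        ...


class ModelName(ModelBase):
    def __init__(self):
        # One-time loading belongs here: weights, environment variables, lookup tables.
        self.channels = ("red", "green", "blue")

    def run(self, input_path, output_path):
        # Read the input file, "predict", and write the result file.
        with open(input_path, "r", encoding="utf-8") as input_file:
            hex_color = input_file.read().strip().lstrip("#")
        values = [int(hex_color[i:i + 2], 16) for i in (0, 2, 4)]
        result = dict(zip(self.channels, values))
        with open(output_path, "w", encoding="utf-8") as output_file:
            json.dump(result, output_file)
```

Because run() only touches the two paths it is given, the same class can be exercised in a unit test with temporary files before it is ever placed in a container.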

Construct Dockerfile

Update the requirements.txt file with any required dependencies for your model, then update the Dockerfile with all of the model application's code, data, and runtime dependencies.

:page-with-curl: Construct Metadata Configuration File

Fill in the YAML configuration file from the template. It contains important metadata that the API uses to run the model on the Modzy Platform.

gRPC Template

Provide model Metadata

Create a new version of your model using semantic versioning (x.x.x), and create a new directory for this version under asset_bundle. Fill out the model.yaml and docker_metadata.yaml files under asset_bundle/x.x.x/ according to the proper specification, then update the __VERSION__ = x.x.x variable located in grpc_model/ before releasing the new version of the model. You must also update the following line in the Dockerfile: COPY asset_bundle/x.x.x ./asset_bundle/x.x.x/

HTTP Template

Complete the model.yaml file, making sure the following sections are filled in:

# Please indicate the names and kinds of input(s) that your model
# expects. The names and types you specify here will be used to
# validate inputs supplied by inference job requests.
  # The value of this key will be the name of the file that is
  # supplied to your model for processing
    # The expected media types of this file. For more information
    # on media types, see:
    # The maximum size that this file is expected to be.
    # A human readable description of what this file is expected to
    # be. This value supports content in Markdown format for including
    # rich text, links, images, etc.

# Please indicate the names and kinds of output(s) that your model
# writes out.
    # The expected media types of this file. For more information
    # on media types, see:
    # The maximum size that this file is expected to be.
    # A human readable description of what this file is expected to
    # be. This value supports content in Markdown format for including
    # rich text, links, images, etc.
    description: |
# The resources section indicates what resources are required by your model
# in order to run efficiently. Keep in mind that there may be many instances
# of your model running at any given time so please be conservative with the
# values you specify here.
    # The amount of RAM required by your model, e.g. 512M or 1G
    # CPU count should be specified as the number of fractional CPUs that
    # are needed. For example, 1 == one CPU core.
    # GPU count must be an integer.
# Please specify a timeout value that indicates a time at which
# requests to your model should be canceled. If you are using a
# webserver with built-in timeouts within your container, such as
# gunicorn, make sure to adjust those timeouts accordingly.
  # Status timeout indicates the timeout threshold for calls to your
  # model's `/status` route, e.g. 20s
  # Run timeout indicates the timeout threshold for files submitted
  # to your model for processing, e.g. 20s
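Before building the image, it can help to sanity-check the resource and timeout values you intend to put in model.yaml. The helper below is a minimal sketch that accepts strings shaped like the examples in the comments above (512M, 1G, 20s). The function names, the set of accepted unit suffixes, and the choice of binary multiples (1K = 1024 bytes) are assumptions for illustration, not part of the Modzy specification.

```python
import re

# Assumed unit table: binary multiples, K/M/G suffixes only.
_SIZE_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(value: str) -> int:
    """Convert a size string such as '512M' or '1G' to a byte count."""
    match = re.fullmatch(r"(\d+)([KMG])", value.strip())
    if match is None:
        raise ValueError(f"unrecognized size: {value!r}")
    number, unit = match.groups()
    return int(number) * _SIZE_UNITS[unit]


def parse_timeout_seconds(value: str) -> int:
    """Convert a timeout string such as '20s' to a number of seconds."""
    match = re.fullmatch(r"(\d+)s", value.strip())
    if match is None:
        raise ValueError(f"unrecognized timeout: {value!r}")
    return int(match.group(1))
```

Rejecting anything that does not match the expected shape surfaces typos such as "512 MB" or "20sec" locally rather than at deployment time.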

:white-check-mark: Test and Validate Model Container

Add unit tests and run validation tests on the model container in the same way the Modzy Platform will spin up the container and run inference.

gRPC Template

Start your model inside a container via the gRPC model server:

docker build -t <container-image-name>:<tag> .
docker run --rm --name <container-name> -it -p 45000:45000 <container-image-name>:<tag>

Then, in a separate terminal, test the containerized server from a local gRPC model client:

poetry run python -m grpc_model.src.model_client

Once the local client test succeeds, your gRPC model is packaged.

HTTP Template

Build the app server image:

docker build -t <container-image-name>:<tag> .

Run the app server container on port 8080:

docker run --name <container-name> -e PSC_MODEL_PORT=8080 -p 8080:8080 -v $(pwd)/data:/data -d <container-image-name>:<tag>

Here, $(pwd)/data must contain sample input data that follows the naming convention defined in the model.yaml file.

Check the container's status:

curl -s "http://localhost:8080/status"

Run some inference jobs. Send the mounted data from the /data container directory to the model for inference:

With cURL:

echo ffaa00 > /data/input.txt
curl -s -X POST -H "Content-Type: application/json" \
    --data "{\"type\":\"file\",\"input\":\"/data\",\"output\":\"/data\"}" \
cat /data/results.json
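The same status check and inference request can be scripted from Python using only the standard library. This is a sketch under stated assumptions: the run route is assumed to be `/run` (exposed here as an overridable parameter), the base URL matches the port mapping used above, and the helper names are hypothetical.

```python
import json
import urllib.request


def build_status_request(base_url: str) -> urllib.request.Request:
    """Build a GET request for the container's /status route."""
    return urllib.request.Request(f"{base_url}/status", method="GET")


def build_run_request(base_url: str, input_dir: str, output_dir: str,
                      run_route: str = "/run") -> urllib.request.Request:
    """Build a POST request asking the model to process files in input_dir.

    The "/run" default is an assumption; adjust it to match your template.
    """
    payload = {"type": "file", "input": input_dir, "output": output_dir}
    return urllib.request.Request(
        f"{base_url}{run_route}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout_seconds: float = 20.0) -> str:
    """Send a request to the running container and return the response body."""
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return response.read().decode("utf-8")
```

With the container running, `send(build_status_request("http://localhost:8080"))` returns the status response body, and `send(build_run_request("http://localhost:8080", "/data", "/data"))` submits the mounted directory for inference.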

Stop the app server:

curl -s -X POST "http://localhost:8080/shutdown"

Check that the exit code is 0:

docker inspect <container-name> --format="{{.State.Status}} {{.State.ExitCode}}"

Clean up the exited Docker container:

docker rm <container-name>

:open-file-folder: Examples

For reference, view the following fully-packaged Modzy-compatible implementations of each template:
