Python sample from template


This sample containerizes a basic image classification model. It uses the Modzy Python Template repository as a starting point.

The sample covers:

  1. Template adjustments to fit a model and the Dockerfile addition.

  2. Model script completion.

  3. Build and complete a YAML file.

  4. Unit tests creation and container validation.



This process requires the installation of Docker, Python and the Model Project Template Repository.

The Modzy Python template repository

This repository serves as a starting point to package a Python model in a Docker container that meets the Modzy API specifications. It adds a run method to the model to run inference jobs.


Repository contents

The template includes a web micro-framework, HTTP object generator code, Gunicorn, and the preset Docker files required to run the model.

Here is the full content list:




Contains model library content such as demo inputs and outputs, documentation, etc.


A utility package that implements the container specification API with Flask.


An example model library package.


Contains the class that wraps the model logic into an interface that the flask_psc_model package can understand.


A set of unit tests.

The model app definition. Here, we wrap the model defined in model_lib with the utilities from flask_psc_model.


The app container definition.

The script used to start the app server inside the container.

The Gunicorn web server configuration file used in the Docker container.


The model metadata containing documentation and technical requirements.


Pinned Python library dependencies for reproducible environments.

Edit template scripts

This guide goes over template adjustments and the Dockerfile addition.

Add a Dockerfile

Base Dockerfile

In the template Dockerfile, select a base Docker image that fits the machine learning frameworks and dependencies the model requires to run. This ensures that the environment in the model container has the correct versions of the frameworks, Python, CUDA, and any other dependencies.

Docker Hub holds thousands of pre-built Docker images to choose from as the base image in the Dockerfile. This template uses a vanilla Python 3.6 image.

In this sample, we start our Dockerfile with a base Docker image that contains the correct versions of Python, PyTorch, and CUDA that our model requires to run.

Add your dependencies

Navigate to the requirements.txt file and add any additional model dependencies. The Dockerfile uses the pip package-management system to install all libraries in the requirements.txt file. Keep all existing requirements:

itsdangerous==1.1.0       # via flask
jinja2==2.10.1            # via flask
markupsafe==1.1.1         # via jinja2
pyreadline==2.1           # via humanfriendly
werkzeug==0.15.4          # via flask

In this sample, we require only one package that does not come pre-installed in our base Docker image: Pillow.
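Because the existing requirements are pinned to exact versions, it can help to check that any added dependency is pinned too. The following is a minimal sketch of such a check, not part of the template; the function name `unpinned_requirements` and the sample text are illustrative assumptions.

```python
import re

# Matches "name==version"; anything else counts as unpinned in this sketch.
PINNED = re.compile(r"^[A-Za-z0-9._-]+==[A-Za-z0-9.*+!_-]+$")


def unpinned_requirements(text):
    """Return the requirement lines that are not pinned with '=='."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if line and not PINNED.match(line):
            problems.append(line)
    return problems


# Hypothetical requirements.txt content: two pinned entries, one unpinned.
sample = """\
itsdangerous==1.1.0       # via flask
jinja2==2.10.1            # via flask
Pillow
"""
print(unpinned_requirements(sample))
```

An unpinned entry still installs, but the resulting image may differ from one build to the next.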

Build the Dockerfile

Open a command-line interface to build the Docker image.

Use the -t argument to tag the Docker image with an intuitive and representative model name. The last argument to this build command specifies the build context, which here is the directory that contains the Dockerfile.

docker build -t image-classification .

In this sample, we navigate in the command line to the same directory as our Dockerfile. We specify our current working directory path with a period.

Run the Dockerfile

Run the container to ensure the Dockerfile correctly satisfies the model requirements.

Use a docker run command to spin up the container:

docker run --rm -it --runtime=nvidia -v $(pwd):/work -w /work image-classification bash

In this sample, we run a command that opens an interactive container where we can test our model code in the environment to ensure it executes properly.

This command contains the following arguments:

--rm

Ensures the container is removed when exited.

-it & bash

Creates an interactive bash shell in the container.

--runtime=nvidia

Allows the container to access GPUs.

-v $(pwd):/work

Mounts the current working directory to a directory inside the container (/work).

-w /work

Sets the working directory to /work. When this command executes, an interactive bash shell opens in the /work directory.

Because the current working directory is mounted to the container, all changes performed to the code carry over to the container’s /work directory. This allows you to make code changes and test without having to rebuild the image and run the container after every change.
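If you prefer to launch the interactive container from a script, the same arguments can be assembled in Python. This is a minimal sketch under stated assumptions: the helper name `docker_run_command` and the host path are hypothetical, and the GPU flag is made optional for machines without the NVIDIA runtime.

```python
import shlex


def docker_run_command(image, host_dir, gpu=True):
    """Build the argument list for the interactive `docker run` shown above."""
    cmd = ["docker", "run", "--rm", "-it"]
    if gpu:
        cmd.append("--runtime=nvidia")  # only valid where the NVIDIA runtime is installed
    # Mount the host directory at /work and start the shell there.
    cmd += ["-v", f"{host_dir}:/work", "-w", "/work", image, "bash"]
    return cmd


# "/home/user/project" is a placeholder for your own working directory.
cmd = docker_run_command("image-classification", "/home/user/project")
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd)` would start the container; building it as a list avoids shell quoting problems with paths that contain spaces.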

Complete the model script

The script holds all the model code including the loading of weights and dependencies, variable instantiation, and inference code.

Add requirements

Paste in any library requirements the model code needs to successfully execute. Do not delete the existing import statements. Edit the THIS_DIR directory path if necessary and add any other relevant paths to the model weights, helper scripts, lookup files, etc.

import json
import os

import sys
import ast
import torch
from PIL import Image
from torchvision import models, transforms

from flask_psc_model import ModelBase, load_metadata

# build paths relative to this file
root = os.path.dirname(os.path.abspath(__file__))
labels_path = os.path.join(root, 'imagenet_classes.txt')

In this sample, we import PyTorch and a few other dependencies. We also specify the path to our class labels, which exist in a file we created in our directory called imagenet_classes.txt.


Use the __init__ function to load any model weights and initialize model object class attributes:

def __init__(self):
    """Load the model files and do any initialization.

    A single instance of this model class will be reused multiple times to perform inference
    on multiple input files so any slow initialization steps such as reading in data
    files or loading an inference graph to GPU should be done here.

    This function should require no arguments, or provide appropriate defaults for all arguments.

    NOTE: The `__init__` function and `run` function may not be called from the same thread so extra
    care may be needed if using frameworks such as Tensorflow that make use of thread locals.
    """
    self.model = models.resnet101(pretrained=True)
    self.model.eval()  # switch layers such as batch norm to inference mode

    # labels
    with open(labels_path, 'r') as f:
        self.labels = ast.literal_eval(f.read())

    # define data transform
    # NOTE: the resize, crop, and tensor-conversion steps are assumed here; they are
    # the usual ImageNet preprocessing for torchvision classification models
    self.transform = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    ])

In this sample, we download a pre-trained image classification model available via PyTorch’s torchvision library. We also use the __init__ function to read in our class labels and define a data transformation.

Add any helper functions to the model class if necessary:

def preprocess(self, image):

    # do data transformation
    img_t = self.transform(image)
    batch_t = torch.unsqueeze(img_t, 0)

    return batch_t

def postprocess(self, predictions):

    percentage = torch.nn.functional.softmax(predictions, dim=1)[0]

    _, indices = torch.sort(predictions, descending=True)
    top5_preds = [(self.labels[idx.item()], percentage[idx].item()) for idx in indices[0][:5]]

    return top5_preds

In this sample, we create a data preprocessing function and a predictions post-processing function to make our inference process easier to follow.
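To make the post-processing step concrete, here is a minimal pure-Python sketch of the same idea: apply a softmax to the raw scores, then keep the highest-scoring labels. It is an illustration rather than the sample's code; the function name `top_k`, the labels, and the scores are hypothetical.

```python
import math


def top_k(logits, labels, k=5):
    """Return the k (label, probability) pairs with the highest scores."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort indices by raw score, highest first, as torch.sort does above.
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return [(labels[i], probs[i]) for i in order[:k]]


# Hypothetical labels and scores for three classes.
labels = ["cat", "dog", "fox"]
for label, probability in top_k([1.0, 3.0, 2.0], labels, k=2):
    print(label, round(probability, 3))
```

Sorting by the raw scores or by the softmax probabilities gives the same order, because softmax preserves ranking.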

Add inference code

Insert any required inference code into the run method:

def run(self, input_path, output_path):
    """Run the model on the given input file paths and write to the given output file paths.

    The input file paths followed by the output file paths will be passed into this function as
    positional arguments in the same order as specified in `input_filenames` and `output_filenames`.
    """
    # read in data; convert to RGB so images with an alpha channel match the 3-channel transform
    image = Image.open(input_path).convert('RGB')

    # data preprocessing
    img = self.preprocess(image)

    # perform inference
    output = self.model(img)

    # post process
    results = self.postprocess(output)

    # save output
    results = {'results': results}

    with open(output_path, 'w') as out:
        json.dump(results, out)

In this sample, we include five steps: data ingest, data preprocessing, inference, prediction post-processing, and output saving.

Update the model name

Adjust the template to fit your model.

Edit the code below the if __name__ == '__main__' line at the bottom of the script.

Open and update the ModelName class to a preferred model name:

class ModelName(ModelBase):

    #: load the `model.yaml` metadata file from up the filesystem hierarchy;
    #: this will be used to avoid hard-coding the below filenames in this file
    metadata = load_metadata(__file__)

    #: a list of input filenames; specifying the `input_filenames` attribute is required to configure the model app
    input_filenames = list(metadata.inputs)

    #: a list of output filenames; specifying the `output_filenames` attribute is required to configure the model app
    output_filenames = list(metadata.outputs)

In the same file, update any other ModelName instances to the new class name:

if __name__ == '__main__':
    # run the model independently from the full application; can be useful for testing
    # to run from the repository root:
    #     python -m model_lib.model /path/to/input.txt /path/to/output.json
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument('input', help='the input data filepath')
    parser.add_argument('output', help='the output results filepath')
    args = parser.parse_args()

    model = ModelName()
    model.run(args.input, args.output)

In the script in the parent directory of the repository, update any other ModelName instances to the new class name:

#!/usr/bin/env python3
"""The HTTP API application.

The model class is used to configure the Flask application that implements the
required HTTP routes.
"""

from flask_psc_model import create_app
from model_lib.model import ModelName

app = create_app(ModelName)

if __name__ == '__main__':
    app.main()

In the tests/ script, update the two instances of ModelName as above:

import unittest

from model_lib.model import ModelName
from flask_psc_model import ModelBase

class TestModel(unittest.TestCase):

    def setUp(self):
        self.model = ModelName()

In the container spun up at the end of the Dockerfile step, test the script with a Python command from the command line. Ensure this command matches the arguments defined in your if __name__ == '__main__' code:

python -m model_lib.model ./data/dog.jpg ./results.json

In this sample, we run our model code script as a module (-m argument), specify the input file path to dog.jpg, and indicate we want our output written to a file called results.json.
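After the script runs, a quick check of the output file confirms that it has the shape written by the run method: a "results" key holding label and probability pairs. The sketch below is an illustrative helper, not part of the template; `load_predictions` is a hypothetical name and the demonstration uses made-up predictions rather than real model output.

```python
import json
import os
import tempfile


def load_predictions(path, expected_count=5):
    """Load a results file and check the shape written by the run method."""
    with open(path) as f:
        results = json.load(f)["results"]
    if len(results) != expected_count:
        raise ValueError(f"expected {expected_count} predictions, got {len(results)}")
    for label, probability in results:
        if not 0.0 <= probability <= 1.0:
            raise ValueError(f"probability out of range for {label!r}")
    return results


# Demonstration with made-up predictions, not real model output.
sample = {"results": [[f"class-{i}", 0.1] for i in range(5)]}
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "results.json")
    with open(sample_path, "w") as out:
        json.dump(sample, out)
    predictions = load_predictions(sample_path)
print(len(predictions))
```

Note that JSON has no tuple type, so the pairs written by json.dump are read back as two-element lists.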

The YAML file

The YAML file contains model metadata.

The top portion of this file allows you to document any important information about the model. It includes a description of the task it performs, what data it trained on, how it performed on validation datasets, etc.

The bottom portion of this file specifies the inputs to the model, hardware and memory requirements to run it, and runtime timeout thresholds.

Build a YAML file

Go to the model.yaml file in the template repository.

YAML is a human-readable data serialization language commonly used for configuration files.

In this case, the model.yaml configuration file contains the model metadata that the Modzy API requires to spin up the model container and run inference.


Set the input file name

To run the model container via Modzy, the input file must match the name specified here.

In this sample, we call the input file image. We omit the file extension because our model supports different filetypes.

Set the input file types

Specify the supported input file types as MIME types. Add them under the acceptedMediaTypes section.

In this sample, the model accepts JPEG and PNG encoded images.

Set the input max size

Set the maximum amount of data the model can process per input item.

In this sample, the model accepts an image up to 1 megabyte in size.

Set an input description

Add a short input description in the description section. Include input item details such as options, dependencies, requirements, and other special considerations.

If the model requires multiple inputs to run, each input must hold its own description section under the inputs section:

# Please indicate the names and kinds of input(s) that your model
# expects. The names and types you specify here will be used to
# validate inputs supplied by inference job requests.
inputs:
  # The value of this key will be the name of the file that is
  # supplied to your model for processing
  image:
    # The expected media types of this file. For more information
    # on media types, see:
    acceptedMediaTypes:
    - image/jpeg
    - image/png
    # The maximum size that this file is expected to be.
    maxSize: 1M
    # A human readable description of what this file is expected to
    # be. This value supports content in Markdown format for including
    # rich text, links, images, etc.
    description: Image file to be classified with model.
    # Accepted image types: jpeg or png encoded images.
  # filename
  <filename>:
    # The expected media types of this file. For more information
    # on media types, see:
    acceptedMediaTypes:
    - application/json
    # The maximum size that this file is expected to be.
    maxSize: 1M
    # A human readable description of what this file is expected to
    # be. This value supports content in Markdown format for including
    # rich text, links, images, etc.
    description: Configuration file that tells the model which classification to execute.

In this sample, the model requires an image and a configuration file.
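The input settings can be read as a set of checks applied to each submitted file. The sketch below illustrates that idea locally and is not how the Modzy API validates inputs. It guesses the media type from a file extension, so an extensionless name such as image is reported as unsupported here, and it assumes that "1M" means one million bytes, which the sample does not confirm.

```python
import mimetypes

ACCEPTED_MEDIA_TYPES = {"image/jpeg", "image/png"}
# Assumption: decimal units. The real interpretation of "1M" may differ.
UNITS = {"K": 1000, "M": 1000 ** 2, "G": 1000 ** 3}


def parse_size(text):
    """Convert a size such as '1M' into a number of bytes."""
    text = text.strip().upper()
    if text[-1] in UNITS:
        return int(float(text[:-1]) * UNITS[text[-1]])
    return int(text)


def input_problems(filename, size_bytes, max_size="1M"):
    """Return a list of reasons the file would be rejected (empty if none)."""
    problems = []
    media_type, _ = mimetypes.guess_type(filename)
    if media_type not in ACCEPTED_MEDIA_TYPES:
        problems.append(f"unsupported media type: {media_type}")
    if size_bytes > parse_size(max_size):
        problems.append(f"file larger than {max_size}")
    return problems


print(input_problems("dog.jpg", 500_000))
print(input_problems("notes.txt", 2_000_000))
```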


Repeat the input steps above, this time for the outputs section:

outputs:
  results.json:
    # The expected media types of this file. For more information
    # on media types, see:
    mediaType: application/json
    # The maximum size that this file is expected to be.
    maxSize: 1M
    # A human readable description of what this file is expected to
    # be. This value supports content in Markdown format for including
    # rich text, links, images, etc.
    description: Top five classifications with their respective prediction probabilities.

In this sample, we output a JSON file called results.json. This file contains the top five image class predictions along with their respective prediction probabilities.


Lastly, complete the resources section of the YAML file. Include the memory, CPU, and GPU amounts required to run the model:

# The resources section indicates what resources are required by your model
# in order to run efficiently. Keep in mind that there may be many instances
# of your model running at any given time so please be conservative with the
# values you specify here.
resources:
  memory:
    # The amount of RAM required by your model.
    size: 512M
  cpu:
    # CPU count should be specified as the number of fractional CPUs that
    # are needed. For example, 1 == one CPU core.
    count: 1
  gpu:
    # GPU count must be an integer.
    count: 0
# Please specify a timeout value that indicates a time at which
# requests to your model should be canceled. If you are using a
# webserver with built in timeouts within your container such as
# gunicorn make sure to adjust those timeouts accordingly.
timeout:
  # Status timeout indicates the timeout threshold for calls to your
  # model's `/status` route.
  status: 20s
  # Run timeout indicates the timeout threshold for files submitted
  # to your model for processing.
  run: 20s

In this sample, the model container runs with 512 megabytes of memory on 1 CPU. It has timeout thresholds of 20 seconds each.

If the GET /status route does not return a successful response within 20 seconds of the model container spinning up, the model container shuts down. Similarly, if the POST /run route does not return an output within 20 seconds of being called, the model container shuts down.
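The timeout behavior amounts to polling until a deadline passes. The following minimal sketch shows that pattern for local testing; it is not Modzy's implementation. The function name `wait_for_status` is hypothetical, and the readiness check is passed in as a callable so the example runs without a container.

```python
import time


def wait_for_status(check, timeout_s=20.0, interval_s=0.5):
    """Call `check()` until it returns True or `timeout_s` seconds elapse."""
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False  # the caller would shut the container down here
        time.sleep(interval_s)


# Stand-in check that succeeds on the third attempt.
attempts = {"count": 0}


def ready_on_third_try():
    attempts["count"] += 1
    return attempts["count"] >= 3


print(wait_for_status(ready_on_third_try, timeout_s=1.0, interval_s=0.01))
```

In practice, `check` could wrap an HTTP request to the container's status route and return True on a 200 response.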

Unit tests and validation

Unit tests

Unit tests validate whether the containerized model works as intended.


This step is not required but is strongly encouraged.

Create a subdirectory called /data under the tests/ directory.

The tests/data directory holds two types of tests: successful data examples and validation errors. Create one subdirectory for each: /example and /validation-error.

mkdir data
cd data/
mkdir example
mkdir validation-error

Within each subdirectory, create folders titled by the name of the test.

cd example/
mkdir 001 002 003
cd ../validation-error
mkdir invalid-file invalid-size empty-file

In this sample, we include three correct example tests (001/, 002/, and 003/). We also include three validation error tests (invalid-file/, invalid-size/, and empty-file/).
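The same directory layout can be created from Python, which is convenient when the tests are set up by a script. This is a minimal sketch; the function name `create_test_layout` is hypothetical, and the demonstration writes into a temporary directory rather than the real tests/ directory.

```python
import tempfile
from pathlib import Path

EXAMPLES = ["001", "002", "003"]
VALIDATION_ERRORS = ["invalid-file", "invalid-size", "empty-file"]


def create_test_layout(tests_dir):
    """Create data/example/* and data/validation-error/* under tests_dir."""
    data = Path(tests_dir) / "data"
    for name in EXAMPLES:
        (data / "example" / name).mkdir(parents=True, exist_ok=True)
    for name in VALIDATION_ERRORS:
        (data / "validation-error" / name).mkdir(parents=True, exist_ok=True)
    return data


# Demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = create_test_layout(tmp)
    created = sorted(p.name for p in (data_dir / "example").iterdir())
print(created)
```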

The template repository contains code that allows the Python unittest module to execute.

This unittest module iterates over the different subdirectories inside the tests/data/ directory. Each test subdirectory within the tests/data/example/ directory must contain:

  1. the sample data input or inputs the model expects, named according to the YAML file.

  2. the expected model output for the respective input(s).

Upon execution, the unittest module runs the model against the input, takes the run’s actual output, and compares it to the expected output file located in each example test folder. If the actual and expected output match, then the unittest module returns an OK message. If they do not match, it returns a FAILURE message.

The unittest module behaves similarly when it iterates over the tests within the tests/data/validation-error/ directory. Here, each test folder must contain:

  1. the sample data input or inputs the model expects, named according to the YAML file.

  2. a file called message.txt that holds the expected error message for the respective input(s).

Upon execution, the unittest module runs each input per folder against the model and compares the actual output (in this case an error message) with the expected error message. If they match, then the unittest module returns an OK message. If they do not match, it returns a FAILURE message.
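The comparison loop for the example tests can be pictured as follows. This is a simplified sketch of the idea, not the template's test code: `run_example_cases` is a hypothetical helper, the input and output file names follow this sample (image and results.json), and a stand-in function replaces the real model.

```python
import json
import tempfile
from pathlib import Path


def run_example_cases(examples_dir, predict, input_name="image", output_name="results.json"):
    """Run `predict` on each case folder and compare with the expected output."""
    outcomes = {}
    for case in sorted(Path(examples_dir).iterdir()):
        if not case.is_dir():
            continue
        expected = json.loads((case / output_name).read_text())
        actual = predict(case / input_name)
        outcomes[case.name] = "OK" if actual == expected else "FAILURE"
    return outcomes


def fake_predict(path):
    """Stand-in for the model: echoes the input text as the result."""
    return {"results": path.read_text()}


# Two demonstration cases; both expect "a", but only 001 supplies "a" as input.
with tempfile.TemporaryDirectory() as tmp:
    for name, text in [("001", "a"), ("002", "b")]:
        case_dir = Path(tmp) / name
        case_dir.mkdir()
        (case_dir / "image").write_text(text)
        (case_dir / "results.json").write_text(json.dumps({"results": "a"}))
    demo = run_example_cases(tmp, fake_predict)
print(demo)
```

The validation-error tests follow the same pattern, except that the expected value comes from message.txt and the actual value is the error message.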

In the running container, run the unit tests with a Python command from the command line:

python -m unittest

If this module returns an OK message, the unit tests have run successfully. Continue to the final validation test to ensure the model container works correctly.

Container validation

The validation test ensures the model container works correctly.

Rebuild the Docker image

Exit the running container and remove the Docker image built at the beginning:

docker image rm image-classification

Up until this point, the unit tests that run interactively inside the container confirm the code changes work. However, since Docker images are immutable, the Docker image did not capture these changes. When the Modzy API spins up the model container, no interactive shell environment exists.

Rebuild the Docker image to capture all the changes:

docker build -t image-classification-final .

Test the container

To test the container, simulate the Modzy API spinning it up.

Run the container on a local available port:

docker run --runtime=nvidia --name Image-Classification -e PSC_MODEL_PORT=8080 -p 8080:8080 -v /data:/data -d image-classification-final

In this sample, we run on port 8080.

This command contains the following arguments:

--name Image-Classification

Assigns a name to the container.

-e PSC_MODEL_PORT=8080

Sets the PSC_MODEL_PORT environment variable to a port inside the container.

-p 8080:8080

Publishes the container’s port to the host. It maps the container port (8080) to the port on the local machine (which in this case is also 8080).

--runtime=nvidia

Allows the container to access GPUs.

-v /data:/data

Mounts a host directory holding the data to pass to the model to a directory inside the container (/data).

-d

Runs the container in the background and prints out the container ID.

Check the container’s status endpoint:

curl -s "http://localhost:8080/status"

This should return a 200 OK status.


Run an inference test providing the input file path. The input file name needs to match the YAML file:

curl -s -X POST -H "Content-Type: application/json" \
    --data "{\"type\":\"file\",\"input\":\"/data\",\"output\":\"/data\"}" \
    "http://localhost:8080/run"

In this sample, we have an image file in the /data directory.

This should also return a 200 OK status.
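The same inference request can be prepared from Python with the standard library. The sketch below only builds the request and does not send it, so it runs without a container; the helper name `build_run_request` is hypothetical, and the /run path and port 8080 are taken from this sample.

```python
import json
import urllib.request


def build_run_request(base_url, input_dir, output_dir):
    """Build (but do not send) the POST /run request."""
    body = json.dumps({"type": "file", "input": input_dir, "output": output_dir})
    return urllib.request.Request(
        url=f"{base_url}/run",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_request("http://localhost:8080", "/data", "/data")
print(request.get_method(), request.full_url)
```

With the container running, `urllib.request.urlopen(request)` sends the request, and the response's `status` attribute holds the HTTP status code.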


Validate that an output file was written to the /data directory and shut down the container:

curl -s -X POST "http://localhost:8080/shutdown"

Clean up the environment and remove the container:

docker rm <insert container name here>
