Deploy a Model with Modzy Python SDK

Follow this guide to automatically deploy your machine learning model

This guide walks through the process of leveraging Modzy's Python SDK to programmatically deploy your model to your private model library in Modzy. This programmatic deployment method also serves as the building block for deploying models via CI/CD.

📘

What you will need

Set Up Environment

Create a Python virtual environment and install the Modzy Python SDK to deploy your model.

pip install modzy-sdk
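The `pip install` command above assumes an activated virtual environment. A minimal setup on a Unix-like shell might look like this (the environment name `modzy-env` is arbitrary):

```shell
# Create an isolated environment for the Modzy SDK
python3 -m venv modzy-env
# Activate it (on Windows, run modzy-env\Scripts\activate instead)
. modzy-env/bin/activate
```

With the environment active, run the `pip install modzy-sdk` command above to install the SDK into it.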

Import Modzy SDK and Initialize Client

Insert your instance URL and API key to establish a connection to the Modzy API client.

# Import Modzy SDK
from modzy import ApiClient, error

# The URL we will use for authentication.
# Note: to use this example, replace MODZY_URL with the URL of your Modzy instance.
BASE_URL = "MODZY_URL"
# The API key we will use for authentication -- paste your API access key below.
API_KEY = "<your.api.key>"

# Set up our API client
client = ApiClient(base_url=BASE_URL, api_key=API_KEY)
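Hard-coding an API key in source is risky. One common pattern (a sketch, not a Modzy requirement; the variable names `MODZY_URL` and `MODZY_API_KEY` are assumptions) is to read credentials from environment variables:

```python
import os

def load_modzy_config(env=None):
    """Return (base_url, api_key) read from environment variables.

    MODZY_URL and MODZY_API_KEY are hypothetical variable names;
    use whatever naming your deployment environment prefers.
    """
    env = os.environ if env is None else env
    base_url = env.get("MODZY_URL", "https://app.modzy.com")
    api_key = env.get("MODZY_API_KEY", "")
    return base_url, api_key

# Example with an explicit mapping, so the snippet is self-contained:
base_url, api_key = load_modzy_config(
    {"MODZY_URL": "https://example.modzy.com", "MODZY_API_KEY": "key-123"}
)
# client = ApiClient(base_url=base_url, api_key=api_key)
```

This keeps secrets out of version control and makes the same script reusable across instances.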

Deploy Model

The client.models.deploy() method requires four parameters at a minimum:

  • container_image (str): The container image repository and tag, i.e., the string you would pass to a docker pull command. For example, if you would download your container image using docker pull modzy/grpc-echo-model:1.0.0, pass just modzy/grpc-echo-model:1.0.0 for this parameter
  • model_name (str): The name of the model you would like to deploy
  • model_version (str): The version of the model you would like to deploy
  • sample_input_file (str): Filepath to a sample piece of data that your model is expected to process and perform inference against

This method also includes many optional parameters to document metadata that describes your model, list its performance metrics, and define details about its expected inputs and outputs. See the SDK reference documentation for more information.

🚧

Note: If you are deploying a new version of a model that already exists in your Modzy instance, make sure to include the optional model_id parameter in your function call, with your existing model's identifier as the value.

model_data = client.models.deploy(
    container_image="modzy/grpc-echo-model:1.0.0",
    model_name="Echo Model",
    model_version="0.0.1",
    sample_input_file="./test.txt",
    run_timeout="60",
    status_timeout="60",
    short_description="This model returns the same text passed through as input, similar to an 'echo.'",
    long_description="This model returns the same text passed through as input, similar to an 'echo.'",
    technical_details="This section can include any technical information about your model. Include information about how your model was trained, any underlying architecture details, or other pertinent information an end-user would benefit from learning.",
    performance_summary="This is the performance summary."
)

print(model_data)
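Per the callout above, deploying a new version of an existing model only adds the `model_id` parameter to the same call. A sketch (the identifier, version, and file path here are hypothetical placeholders):

```python
def deploy_new_version(client):
    """Sketch: deploy version 0.0.2 of a model already in your library.

    The model_id value below is hypothetical; substitute the identifier
    of your existing model.
    """
    return client.models.deploy(
        container_image="modzy/grpc-echo-model:1.0.0",
        model_name="Echo Model",
        model_version="0.0.2",
        sample_input_file="./test.txt",
        model_id="thjg0zuntf",  # identifier of the existing model
    )
```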

Finally, view the output returned from the deployment method to navigate to your newly deployed model's biography page. The final key-value pair in this JSON is the URL to your model (container_url).

{
    "model_data": {
        "version": "0.0.1",
        "createdAt": "2022-08-16T01:10:52.821+00:00",
        "updatedAt": "2022-08-16T01:10:53.498+00:00",
        "inputValidationSchema": "",
        "timeout": {
            "status": 60000,
            "run": 60000
        },
        "requirement": {
            "requirementId": 1
        },
        "containerImage": {
            "uploadStatus": "IN_PROGRESS",
            "loadStatus": "IN_PROGRESS",
            "uploadPercentage": 0,
            "loadPercentage": 0,
            "containerImageSize": 0,
            "repositoryName": "thjg0zuntf"
        },
        "inputs": [
            {
                "name": "input",
                "acceptedMediaTypes": "application/json",
                "maximumSize": 1000000,
                "description": "Default input data"
            }
        ],
        "outputs": [
            {
                "name": "results.json",
                "mediaType": "application/json",
                "maximumSize": 1000000,
                "description": "Default output data"
            }
        ],
        "statistics": [],
        "isActive": false,
        "longDescription": "Long Description",
        "technicalDetails": "Techincal Details",
        "isAvailable": true,
        "status": "partial",
        "performanceSummary": "Performance summary",
        "model": {
            "modelId": "thjg0zuntf",
            "latestVersion": "0.0.1",
            "latestActiveVersion": "",
            "versions": [
                "0.0.1"
            ],
            "author": "Integration",
            "name": "Echo Model",
            "description": "Short Description",
            "permalink": "thjg0zuntf-integration-echo-model",
            "features": [],
            "isActive": false,
            "isRecommended": false,
            "isCommercial": false,
            "tags": [],
            "createdByEmail": "[email protected]",
            "createdByFullName": "First Last",
            "visibility": {
                "scope": "PRIVATE"
            }
        },
        "processing": {
            "minimumParallelCapacity": 0,
            "maximumParallelCapacity": 1
        },
        "originSidecar": false
    },
    "container_url": "https://modzy-instance.app.modzy.com/models/thjg0zuntf/0.0.1"
}
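Assuming `client.models.deploy()` returns a dictionary shaped like the JSON above (an abbreviated copy is inlined here so the snippet stands alone), you can extract the model identifier and biography-page URL programmatically:

```python
# Abbreviated copy of the deploy() response shown above
model_data = {
    "model_data": {
        "version": "0.0.1",
        "status": "partial",
        "containerImage": {"uploadStatus": "IN_PROGRESS", "loadStatus": "IN_PROGRESS"},
        "model": {"modelId": "thjg0zuntf", "name": "Echo Model"},
    },
    "container_url": "https://modzy-instance.app.modzy.com/models/thjg0zuntf/0.0.1",
}

details = model_data["model_data"]
model_id = details["model"]["modelId"]
print(f"Deployed {details['model']['name']} v{details['version']} (id: {model_id})")
print(f"Model biography page: {model_data['container_url']}")
```

The `container_url` value is what you would open in a browser to view the new model's biography page.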

ARM Models for Edge

The above sections in this guide outline the workflow for a typical model deployment with Modzy. In this deployment process, Modzy spins up a processing engine, loads your model, runs a sample inference, and performs a few other validation tests to ensure your model meets the proper specification and can run as expected. This process expects that your model container is compiled for hardware with an amd64 (Intel) chip. As a result, if you compile your model for an arm chip (relevant for most edge-like devices) or a different chipset, you must take a slightly different approach to deploying your model. This section demonstrates the process of deploying via the Python SDK and via the Modzy UI.

Deploying with Python SDK (Highly Recommended)

🚧

ARM Model Considerations

When you deploy a model compiled for an arm chip, Modzy skips several validation tests it otherwise performs on models compiled for amd64 chips. This has a couple of consequences:

  • It is possible to deploy a model all the way to an edge device that does not properly load or run - we recommend testing your container on an instance of the target architecture to avoid this scenario.
  • Modzy will not automatically infer information about the model's expected inputs and outputs, and as a result, the information on your model's API page will include default values. This means copy-pasting this code to run your model(s) will not work. Using the Python SDK deployment route is a viable workaround (see more below).

Python SDK Method

Deploying a non-amd64 model with Modzy's Python SDK follows a nearly identical process as outlined in the above sections. Models compiled for the following architectures are supported by Modzy today:

  • amd64: All Intel chipsets
  • arm64: Modern ARM chipsets
  • arm (arm32): Older or smaller ARM devices
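A tiny guard (a sketch, not part of the SDK) can catch an unsupported architecture string before you call `deploy()`:

```python
# Architectures Modzy currently supports, per the list above
SUPPORTED_ARCHITECTURES = {"amd64", "arm64", "arm"}

def validate_architecture(arch: str) -> str:
    """Raise early if an unsupported architecture string is passed."""
    if arch not in SUPPORTED_ARCHITECTURES:
        raise ValueError(
            f"Unsupported architecture {arch!r}; "
            f"expected one of {sorted(SUPPORTED_ARCHITECTURES)}"
        )
    return arch
```

Failing fast here is cheaper than waiting for a deployment attempt to error out server-side.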

To specify your ARM chip type, simply make a small change to the deploy method:

inputs = [
  {
    "name": "input",
    "acceptedMediaTypes": "text/plain",
    "maximumSize": 1000000,
    "description": "Custom input data description"    
  }
]

outputs = [
  {
    "name": "results.json",
    "acceptedMediaTypes": "application/json",
    "maximumSize": 1000000,
    "description": "Custom output data description"    
  }
]

model_data = client.models.deploy(
    container_image="modzy/grpc-echo-model:1.0.0",
    model_name="Echo Model",
    model_version="0.0.1",
    architecture="arm64",
    run_timeout="60",
    status_timeout="60",
    short_description="This model returns the same text passed through as input, similar to an 'echo.'",
    long_description="This model returns the same text passed through as input, similar to an 'echo.'",
    technical_details="This section can include any technical information about your model. Include information about how your model was trained, any underlying architecture details, or other pertinent information an end-user would benefit from learning.",
    performance_summary="This is the performance summary.",
    input_details=inputs,
    output_details=outputs,
)

print(model_data)

Notice a few small changes from the example in the above section:

  • The architecture parameter is set to arm64. By default, this parameter is set to amd64 and is not shown in the example in the previous section
  • The sample_input_file parameter is removed, because in the deployment of ARM-based models, Modzy will not run a sample inference test
  • The input_details and output_details parameters are included (and highly recommended). Including these parameters will populate the API section of your model details page with the correct information users can copy-paste to run this model. Note: If you built your model to expect a specific input filename and return a specific output filename, you must make sure your input and output details match what the model expects. Otherwise, there will be a discrepancy between the code snippets in the API section and what your model actually expects

After running this code, you will notice the model will publish quickly. At this point, you can deploy it to a device group. Learn more about deploying models to edge devices here.

Modzy UI Method

If you choose to deploy your arm64 or arm model via Modzy's UI, follow the same initial steps outlined in the Import Container guide until you reach the hardware selection step. At this stage, simply select the first option:

Figure 1. Hardware option for ARM models

📘

Note: The 0 cores and 0Gi RAM indications do not reflect the true amount of resources you are selecting. The purpose of this deployment option is to give users a way to deploy models to one or many edge devices, all of which may have different resource constraints. It is up to the model developer to ensure the model container size and inference RAM requirements remain within the limits of the target edge hardware.

After your model container loads, Modzy skips the remaining deployment steps and allows you to Publish your model. At this point, the default input and output information will appear on the API page. Leverage a separate utility function in Modzy's Python SDK to edit this metadata; an example code snippet is provided below.

model_info = {
    "technicalDetails": {
        "inputs": [
            {
                "name": "input",
                "acceptedMediaTypes": "text/plain",
                "maximumSize": 5000000,
                "description": "Input description"
            }
        ],
        "outputs": [
            {
                "name": "results.json",
                "mediaType": "application/json",
                "maximumSize": 1000000,
                "description": "Output description"
            }
        ],    
    }
}

model_data = client.models.edit_model_metadata(
    model_id="kp3qcrq0eel",
    model_version="0.0.1",
    long_description="This model returns the same text passed through as input, similar to an 'echo.'",
    technical_details="This section can include any technical information about your model. Include information about how your model was trained, any underlying architecture details, or other pertinent information an end-user would benefit from learning.",
    performance_summary="This is the performance summary.",
    input_details=model_info["technicalDetails"]["inputs"],
    output_details=model_info["technicalDetails"]["outputs"], 
)

Congratulations! In just minutes, you programmatically deployed your new model container to Modzy. To turn this building block into a fully automated CI/CD pipeline, check out this GitHub Actions example.