6. Deploy model to edge device


Self-Service Tutorial Contents

  1. Package Model
  2. Deploy Model
  3. Scale Model Up
  4. Run Model Inference
  5. Set Drift Baseline
  6. → Deploy Model to Edge Device

Prepare to deploy your model

In this tutorial, we will continue our end-to-end Modzy journey by sending the model we've been working with to an edge device, running that model at the edge, and interacting with it using Modzy's Python SDK and Inference API.


What you'll need for this tutorial

  • The same model we've been using up until this point
  • An x86 edge device with Linux and Docker installed (we tested this tutorial on an UP Board but check out our minimum requirements to see if your device will run Modzy Core)
  • A valid Modzy account
  • Your local Python environment with Modzy's SDK installed (Python >= 3.6)

Verify device setup

First, we will verify that your device is properly set up and ready to host your model. Checking this now prevents avoidable errors later in the tutorial.

Log in to your device (it must be running Linux) via SSH or another method you prefer, then follow the verification steps below.

Confirm that Docker is installed and running

If you have not already installed Docker on your device, follow the appropriate instructions. Once installed, verify Docker is configured correctly by typing the following in your terminal:

$ docker run hello-world

If your output looks similar to the following, you are good to go!

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:

For more examples and ideas, visit:

A common error at this step is "permission denied." If you see it, Docker is likely not configured to be managed by a non-root user (i.e., you can only run Docker with sudo). Follow these post-installation instructions to fix this.

Double-checking that Docker can run as a non-root user will save you many headaches when it comes time to run Modzy Core.
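The non-root check can also be scripted. The sketch below is a minimal example, assuming the docker CLI is on your PATH. It runs docker info as the current user and reports whether the command succeeded. The helper name docker_usable_without_sudo is hypothetical and is not part of Modzy or Docker.

```python
import subprocess


def docker_usable_without_sudo(run=subprocess.run):
    """Return True if `docker info` succeeds for the current user.

    `run` is injectable so the check can be exercised without Docker.
    """
    try:
        completed = run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (FileNotFoundError, PermissionError, subprocess.TimeoutExpired):
        # Docker is not installed, not executable, or not responding.
        return False
    # A non-zero exit code typically means the daemon is unreachable
    # or the current user lacks permission to talk to it.
    return completed.returncode == 0
```

Calling docker_usable_without_sudo() on the edge device returns False when the current user cannot reach the Docker daemon, which is the usual symptom of the "permission denied" case described above.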

Confirm that your device uses an Intel or AMD CPU

In your edge device terminal, type the following command:

$ lscpu

Your output should look similar to this:

Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   36 bits physical, 48 bits virtual
CPU(s):                          4
On-line CPU(s) list:             0-3
Thread(s) per core:              1
Core(s) per socket:              4
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           76
Model name:                      Intel(R) Atom(TM) x5-Z8350  CPU @ 1.44GHz
Stepping:                        4
CPU MHz:                         479.997
CPU max MHz:                     1920.0000
CPU min MHz:                     480.0000
BogoMIPS:                        2880.00
Virtualization:                  VT-x
L1d cache:                       96 KiB
L1i cache:                       128 KiB
L2 cache:                        2 MiB
NUMA node0 CPU(s):               0-3
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Mitigation; Clear CPU buffers; SMT disabled
Vulnerability Meltdown:          Mitigation; PTI
Vulnerability Mmio stale data:   Not affected
Vulnerability Spec store bypass: Not affected
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts
                                 rep_good nopl xtopology tsc_reliable nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 movbe popcnt
                                  tsc_deadline_timer aes rdrand lahf_lm 3dnowprefetch epb pti ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid tsc_adjust smep erms dtherm ida arat md_clear

The line we are interested in is the Architecture line at the top. Verify that this reads amd64 or x86_64. If so, you're good to go!


ARM Chip Caveat

This tutorial assumes your device has an x86 (Intel or AMD) chipset. If your device uses a variant of the ARM architecture, you can still deploy a model and Modzy Edge to the device, but you will need to refer to our tutorial for ARM devices.

Common ARM variants you might see: aarch64, arm64v8, arm32v7, arm32v6, armv6l
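If you prefer to check the architecture from Python, the following sketch classifies the name reported by platform.machine() into the two families discussed above. The function name classify_architecture and its return labels are hypothetical, and the prefix matching is an assumption that covers the ARM variants listed here rather than every possible name.

```python
import platform

X86_64_NAMES = {"x86_64", "amd64"}
ARM_PREFIXES = ("aarch64", "arm")


def classify_architecture(machine=None):
    """Classify a machine name as 'x86_64', 'arm', or 'unknown'.

    When `machine` is omitted, the name of the current host is used.
    """
    name = (machine or platform.machine()).lower()
    if name in X86_64_NAMES:
        return "x86_64"
    if name.startswith(ARM_PREFIXES):
        return "arm"
    return "unknown"
```

A result of "x86_64" means this tutorial applies as written; "arm" points to the ARM tutorial instead.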

Create a device group


As we proceed with this tutorial, stay logged into your edge device!

To deploy your model to an edge device, you must first create a Device Group. Back in your Modzy instance, navigate to the "Operations" tab and select the "Edge Devices" page. From here, click the "New Device Group" button in the top right corner of the screen.

Edge Devices Page

Next, you'll need to provide the following information:

  • Fill out the Device group name field, calling it something like "Edge Deployment Tutorial Group"
  • Click on the Add model button and select the Hugging Face TinyBERT (AMD) model we have been working with up until this point (refer to the Deploy Model tutorial if you have not yet deployed this model)
  • Use the default values for all other fields.
Device Group Configuration

With this configuration set, click "Create New Device Group" to continue, which will take you back to the Edge Devices page.

New Device Group Successfully Created

Create a token for your device group

Now, you will need to create a device token, which will be used to securely register your edge device with Modzy. Select your newly created device group and click the "Create Device Token" button found in the top right corner of the screen.

Device Group Configuration Page

You will next be prompted to set two fields:

  • Set the Expiration time to 1 hour if you plan to complete this tutorial immediately. (After 1 hour, this token will expire and you'll need to create a new token.)
  • Set the Number of uses to 10. The token will no longer work after it's been used a set number of times, so setting this value to 10 allows for debugging without the token expiring too quickly.
Device Registration Creation Page

Once you've filled in these values, click the "Create Device Token" button.

Next, you'll generate the commands needed to install and run Modzy Core on your edge device.

With the "Using wget" tab selected, set "Device type" to "Linux with 64-bit Intel/AMD chip".

Device Type Selection

Then click "Generate Commands". The following will show up on your screen:

Device Registration Token Commands

Copy these three commands into a notepad or text editor so you can run them in a terminal on your edge device.


Act quickly! wget URLs are time-sensitive

The wget download URL expires after 10 minutes, so make sure to move on to the next steps right away.

Install Modzy Core on your edge device

Switch over to your edge device and navigate to the folder where you'd like to install Modzy Core. In your terminal, run the first command you copied in the previous step. This command will look something like this:

wget 'https://<>/api/downloads/modzy-core-linux-amd64?token=s.bqCRPEc1BpAIVdBDxKzrz0zF' -O modzy-core

In your edge device terminal, you should see the binary begin to download. After the download completes, run the next command:

chmod +x modzy-core

Finally, modify the third command slightly before running it: add --model.timeout 30 to increase the model timeout. This is required because this model takes longer than the default 10 seconds to load with Modzy Core.

Your command will look something like this:

./modzy-core server --model.timeout 30 --modzy.url https://<> DEVICE_REGISTRATION_TOKEN

In the example above, make sure to replace DEVICE_REGISTRATION_TOKEN with a valid, unexpired device token (it will look something like s.mJdlDcCOxE7XyZSlUwTtDvCd).
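If you launch Modzy Core from a script, it can help to assemble the command programmatically and refuse to run while the token placeholder is still in place. The sketch below uses only the flags shown in the command above. The helper name build_server_command and the example URL and token are hypothetical placeholders.

```python
import shlex

PLACEHOLDER = "DEVICE_REGISTRATION_TOKEN"


def build_server_command(modzy_url, device_token, model_timeout=30):
    """Return the argument list for starting Modzy Core.

    Raises ValueError if the token is empty or still the placeholder.
    """
    if not device_token or device_token == PLACEHOLDER:
        raise ValueError("replace the placeholder with a real device token")
    return [
        "./modzy-core", "server",
        "--model.timeout", str(model_timeout),
        "--modzy.url", modzy_url,
        device_token,
    ]


def as_shell_string(command):
    """Render the argument list as a copy-pasteable shell command."""
    return " ".join(shlex.quote(part) for part in command)
```

The returned list can be passed directly to subprocess.run on the edge device, or printed with as_shell_string for a manual run.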

After running these commands you should see logs beginning to populate your terminal. These logs will let you see the progress of your model container being downloaded to the device. Once the process is complete, you should see a set of logs that look similar to the following:

2023-07-07T20:48:54.784Z	INFO	logger/logger.go:108	acquiring lock
2023-07-07T20:48:54.786Z	INFO	logger/logger.go:108	connecting to Modzy...
2023-07-07T20:48:54.787Z	INFO	logger/logger.go:108	device token provided, proceeding with registration via http
2023-07-07T20:48:55.496Z	INFO	nats/server.go:31	starting internal NATS server...
2023-07-07T20:48:55.723Z	INFO	cmd/server.go:188	migrating database
2023-07-07T20:48:55.762Z	INFO	logger/logger.go:108	Configuring Core
2023-07-07T20:48:55.815Z	INFO	logger/logger.go:108	Modzy Core server is starting...
2023-07-07T20:48:55.818Z	INFO	apiserver/server.go:232	Server is starting...
2023-07-07T20:48:55.820Z	INFO	apiserver/server.go:274	Server is listening at :55000
✓ Reconciling Models 

Your model can take several minutes to download. Once you see "✓ Reconciling Models", your model is running live and ready to serve predictions via both REST and gRPC requests!
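If you capture the Modzy Core output to a file, a short script can pick out the two signals described above: the port the server listens on and the "Reconciling Models" check mark. This is a sketch that assumes the log lines keep the format shown in the sample output. The function name summarize_startup is hypothetical.

```python
import re

# Matches e.g. "Server is listening at :55000" and captures the port.
LISTENING_RE = re.compile(r"Server is listening at :(\d+)")


def summarize_startup(log_lines):
    """Extract the listening port and model readiness from log lines."""
    port = None
    models_ready = False
    for line in log_lines:
        match = LISTENING_RE.search(line)
        if match:
            port = int(match.group(1))
        # Readiness is signaled by the check mark next to "Reconciling Models".
        if "✓" in line and "Reconciling Models" in line:
            models_ready = True
    return {"port": port, "models_ready": models_ready}
```

For the sample logs above, the port would be 55000, which is the port used when connecting to the device later in this tutorial.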

View your device on the Modzy app

Once Modzy Core is installed on your device and you have run the third command above, you can view that device within Modzy. Return to Modzy and select your device group. Then click on the "Devices" tab. The device you just connected should appear. Click on your device.

The page that appears provides information about your device that you'll need in the next step: your device's IP address, and the ID and version number of the Hugging Face TinyBERT (AMD) model deployed to your device.

Device detail page

Run an inference against your model

Now that the model is up and running, you can run your first inference against it. At this point, the model is accessible via REST or gRPC APIs. Follow these steps to run an inference using Modzy Core's Inference API:

  • Create and activate a Python environment on your local machine
  • Install Modzy's Python SDK in your Python environment: pip install "modzy-sdk>=11.3.0" (quoted so the shell does not treat > as a redirect)
  • Make sure you (1) know the IP address of your edge device, and (2) can access that IP address from your Python environment
  • Know the model identifier and version number of the TinyBERT model deployed to your Modzy instance.
  • Copy the sample inference script below and save it to your local Python environment as
  • Replace the MODEL_ID, MODEL_VERSION, and DEVICE_IP_ADDRESS variables with your model's identifier and version in Modzy and your edge device's IP address (lines 7-9).
  • Run the following command from your terminal python3
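import json

import modzy
from modzy.edge import InputSource
from modzy.edge.proto.inferences.api.v1.inferences_pb2 import Inference

MODEL_ID = "<insert-model-id>"
MODEL_VERSION = "<insert-model-version>"
DEVICE_IP_ADDRESS = "<insert-device-ip-address>"

# Connect to the Modzy Core server on the edge device (port 55000).
client = modzy.EdgeClient(DEVICE_IP_ADDRESS, 55000)
client.connect()  # assumed: this call is missing from the listing as published

text_bytes = "Modzy Edge is super fun!".encode()
# NOTE: the InputSource arguments are missing from the listing as published.
# The "input.txt" key is an assumption; use the input filename shown on
# your model's details page.
input_obj = InputSource(
    key="input.txt",
    data=text_bytes,
)

try:
    # NOTE: the call name is missing from the listing as published;
    # client.inferences.run is assumed here.
    inference: Inference = client.inferences.run(MODEL_ID, MODEL_VERSION, [input_obj])
    result = inference.result
    outputs = result.outputs
    output = outputs["results.json"]
    output_bytes = output.data  # assumed attribute holding the raw output bytes
    obj = json.loads(output_bytes)
    obj_formatted = json.dumps(obj, indent=4)
    print(obj_formatted)
except Exception as e:
    print(e)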
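The checklist above asks that the edge device's IP address be reachable from your Python environment. One way to confirm this before running the script is a plain TCP connection test against port 55000, the port Modzy Core reports in its startup logs. The sketch below uses only the standard library. The helper name can_reach is hypothetical and is not part of Modzy's SDK.

```python
import socket


def can_reach(host, port=55000, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

If can_reach returns False for your device's IP address, check that Modzy Core is still running on the device and that no firewall blocks the port between the two machines.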

Stop model

Now that you have deployed your model to your edge device and run inference against it, you can stop the model at any time by killing the Modzy Core process you started with the ./modzy-core server command.

To do so, go to the terminal running Modzy Core and press Ctrl+C. This should cleanly stop all running Modzy Core processes, which in turn removes all containers. Run the following command to verify that your model container(s) have been removed:

docker ps -a

If no model containers appear in the output (stopped containers from earlier steps, such as hello-world, may still be listed), you are good to go! If not, clean up any remaining containers by copying the ID of the container(s) and typing the following in your terminal:

docker rm -f <insert-container-id-1> <insert-container-id-2> <insert-container-id-...>
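Copying container IDs by hand can be error-prone when several containers are listed. The sketch below assumes you captured the output of docker ps -a --format "{{.ID}}\t{{.Image}}" (one tab-separated ID and image name per line). It selects the containers whose image name contains a hint you choose, then builds the matching docker rm -f command. The function names and the image names in the usage note are hypothetical.

```python
def leftover_container_ids(ps_output, image_hint):
    """Return IDs of containers whose image name contains image_hint.

    Expects one "<id>\\t<image>" pair per line of ps_output.
    """
    ids = []
    for line in ps_output.splitlines():
        parts = line.split("\t")
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        container_id, image = parts
        if image_hint in image:
            ids.append(container_id.strip())
    return ids


def removal_command(container_ids):
    """Build the `docker rm -f` argument list for the given IDs."""
    return ["docker", "rm", "-f"] + list(container_ids)
```

For example, with a hint of "tinybert", a listing that contains one TinyBERT container and one hello-world container yields only the TinyBERT container's ID, so unrelated containers are left alone.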

Restart model

Whenever you are ready to spin up your model again and submit more inferences, resume the process with the following command:

./modzy-core server --resume