6. Deploy model to edge device
Prepare to deploy your model
In this tutorial, we will continue our end-to-end Modzy journey by sending the model we've been working with to an edge device, running that model at the edge, and interacting with it using Modzy's Python SDK and Inference API.
What you'll need for this tutorial
- The same model we've been using up until this point
- An x86 edge device with Linux and Docker installed (we tested this tutorial on an UP Board, but check out our minimum requirements to see if your device will run Modzy Core)
- A valid Modzy account
- Your local Python environment with Modzy's SDK installed (Python >= 3.6)
Verify device setup
The first thing we will do is verify that your device is properly set up and ready to host your model. The goal here is to prevent avoidable errors.
Log into your device (reminder: it must be running a Linux OS) via SSH or another preferred method, then follow the verification instructions below.
Confirm that Docker is installed and running
If you have not already installed Docker on your device, follow the appropriate instructions. Once installed, verify Docker is configured correctly by typing the following in your terminal:
$ docker run hello-world
If your output looks similar to the following, you are good to go!
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
(amd64)
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID:
https://hub.docker.com/
For more examples and ideas, visit:
https://docs.docker.com/get-started/
A common error you might encounter is "permission denied." If this happens, you likely have not configured Docker to be managed by a non-root user (i.e., you can only run Docker with sudo). Follow these post-installation instructions to do so.
Double-checking that Docker can run as a non-root user will save you many headaches when it comes time to run Modzy Core.
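The Docker check above can also be scripted. The following is a minimal Python sketch, not part of Modzy's tooling, that runs the same docker run hello-world command and interprets the result. The function names diagnose_docker and check_docker and the returned labels are illustrative assumptions.

```python
import shutil
import subprocess


def diagnose_docker(returncode: int, output: str) -> str:
    """Map the result of 'docker run hello-world' to a short diagnosis label."""
    text = output.lower()
    if returncode == 0 and "hello from docker!" in text:
        return "ok"
    if "permission denied" in text:
        # Typically means Docker only works with sudo for the current user.
        return "permission-denied"
    return "failed"


def check_docker() -> str:
    """Run the hello-world container and return a diagnosis label."""
    if shutil.which("docker") is None:
        return "not-installed"
    proc = subprocess.run(
        ["docker", "run", "--rm", "hello-world"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so error text is inspected too
        universal_newlines=True,
    )
    return diagnose_docker(proc.returncode, proc.stdout)
```

Calling check_docker() on the edge device returns one of the labels above. A result of permission-denied points back to the post-installation instructions for running Docker as a non-root user.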
Confirm that your device uses an Intel or AMD CPU
In your edge device terminal, type the following command:
$ lscpu
Your output should look similar to this:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 36 bits physical, 48 bits virtual
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 76
Model name: Intel(R) Atom(TM) x5-Z8350 CPU @ 1.44GHz
Stepping: 4
CPU MHz: 479.997
CPU max MHz: 1920.0000
CPU min MHz: 480.0000
BogoMIPS: 2880.00
Virtualization: VT-x
L1d cache: 96 KiB
L1i cache: 128 KiB
L2 cache: 2 MiB
NUMA node0 CPU(s): 0-3
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Mitigation; Clear CPU buffers; SMT disabled
Vulnerability Meltdown: Mitigation; PTI
Vulnerability Mmio stale data: Not affected
Vulnerability Spec store bypass: Not affected
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts
rep_good nopl xtopology tsc_reliable nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 movbe popcnt
tsc_deadline_timer aes rdrand lahf_lm 3dnowprefetch epb pti ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid tsc_adjust smep erms dtherm ida arat md_clear
The line we are interested in is the Architecture line at the top. Verify that this reads amd64 or x86_64. If so, you're good to go!
ARM Chip Caveat
This tutorial assumes your device has an x86 (Intel/AMD) chipset. If your device architecture is a variant of the ARM chipset family, you can still deploy a model and Modzy Edge to the device, but you will need to refer to our tutorial for ARM devices.
Common ARM variants you might see:
aarch64, arm64v8, arm32v7, arm32v6, armv6l
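If you prefer to check the architecture from Python rather than reading the lscpu output, a small sketch like the one below can help. It is an illustration only: classify_arch is a hypothetical helper, the ARM names are the variants listed above, and the prefix check for other ARM names is an assumption.

```python
import platform

# Names that indicate a 64-bit Intel/AMD CPU.
X86_64_NAMES = {"x86_64", "amd64"}
# ARM variants listed in this tutorial.
ARM_NAMES = {"aarch64", "arm64v8", "arm32v7", "arm32v6", "armv6l"}


def classify_arch(machine: str) -> str:
    """Return 'x86_64', 'arm', or 'unknown' for a machine/architecture string."""
    name = machine.strip().lower()
    if name in X86_64_NAMES:
        return "x86_64"
    # Assumption: any other name starting with 'arm' or 'aarch' is an ARM variant.
    if name in ARM_NAMES or name.startswith(("arm", "aarch")):
        return "arm"
    return "unknown"


if __name__ == "__main__":
    # platform.machine() reports the architecture of the machine running this script.
    print(classify_arch(platform.machine()))
```

A result of x86_64 means this tutorial applies as written. A result of arm means you should follow the ARM tutorial instead.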
Create a device group
As we proceed with this tutorial, stay logged into your edge device!
To deploy your model to an edge device, you must first create a Device Group. Back in your Modzy instance, navigate to the "Operations" tab and select the "Edge Devices" page. From here, click on the "New Device Group" button in the top right corner of the screen.

Edge Devices Page
Next, you'll need to provide the following information:
- Fill out the Device group name field, calling it something like "Edge Deployment Tutorial Group"
- Click on the Add model button and select the Hugging Face TinyBERT (AMD) model we have been working with up until this point (refer to the Deploy Model tutorial if you have not yet deployed this model)
- Use the default values for all other fields

Device Group Configuration
With this configuration set, click "Create New Device Group" to continue, which will take you back to the Edge Devices page.

New Device Group Successfully Created
Create a token for your device group
Now, you will need to create a device token, which will be used to securely register your edge device with Modzy. Select your newly created device group and click the "Create Device Token" button found in the top right corner of the screen.

Device Group Configuration Page
You will next be prompted to set two fields:
- Set the Expiration time to 1 hour if you plan to complete this tutorial immediately. (After 1 hour, this token will expire and you'll need to create a new token.)
- Set the Number of uses to 10. The token will no longer work after it's been used a set number of times, so setting this value to 10 allows for debugging without the token expiring too quickly.

Device Registration Creation Page
Once you've filled in these values, click the "Create Device Token" button.
Next, you'll generate the commands needed to install and run Modzy Core on your edge device.
With the "Using wget" tab selected, set "Device type" to "Linux with 64-bit Intel/AMD chip".

Device Type Selection
Then click "Generate Commands". The following will show up on your screen:

Device Registration Token Commands
Copy these three commands into a notepad or text editor so you can run them in a terminal on your edge device.
Act quickly! wget URLs are time-sensitive
The wget download URL expires after 10 minutes, so make sure to move on to the next steps right away.
Install Modzy Core on your edge device
Switch over to your edge device and navigate to the folder where you'd like to install Modzy Core. Then, in your terminal, run the first command you copied in the previous step. This command will look something like this:
wget 'https://<yourmodzydomain.app.modzy.com>/api/downloads/modzy-core-linux-amd64?token=s.bqCRPEc1BpAIVdBDxKzrz0zF' -O modzy-core
In your edge device terminal, you should see the binary begin to download. After the download completes, run the next command:
chmod +x modzy-core
Finally, we will slightly modify the third command before running it to increase the model timeout (add --model.timeout 30). This is required because this model takes longer than 10 seconds (the default value) to load with Modzy Core.
Your command will look something like this:
./modzy-core server --model.timeout 30 --modzy.url https://<yourmodzydomain.app.modzy.com> DEVICE_REGISTRATION_TOKEN
In the example above, make sure to replace DEVICE_REGISTRATION_TOKEN with a valid, unexpired device token (it will look something like s.mJdlDcCOxE7XyZSlUwTtDvCd).
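If you script your device setup, assembling this command programmatically helps avoid typos in the flags or a forgotten placeholder. The sketch below only builds the command shown above; it does not run it. The helper names build_server_command and to_shell and the example domain are assumptions made for illustration.

```python
import shlex
from typing import List


def build_server_command(modzy_url: str, device_token: str, model_timeout: int = 30) -> List[str]:
    """Return the argument list for starting Modzy Core, mirroring the command above."""
    if not device_token or device_token == "DEVICE_REGISTRATION_TOKEN":
        raise ValueError("replace the placeholder with a valid device token")
    return [
        "./modzy-core", "server",
        "--model.timeout", str(model_timeout),
        "--modzy.url", modzy_url,
        device_token,
    ]


def to_shell(argv: List[str]) -> str:
    """Join an argument list into a shell-safe command string."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # Hypothetical domain and the sample token format from this tutorial.
    cmd = build_server_command("https://example.app.modzy.com", "s.mJdlDcCOxE7XyZSlUwTtDvCd")
    print(to_shell(cmd))
```

The returned list can be passed directly to subprocess.run, or printed with to_shell and pasted into your terminal.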
After running these commands you should see logs beginning to populate your terminal. These logs will let you see the progress of your model container being downloaded to the device. Once the process is complete, you should see a set of logs that look similar to the following:
2023-07-07T20:48:54.784Z INFO logger/logger.go:108 acquiring lock
2023-07-07T20:48:54.786Z INFO logger/logger.go:108 connecting to Modzy...
2023-07-07T20:48:54.787Z INFO logger/logger.go:108 device token provided, proceeding with registration via http
2023-07-07T20:48:55.496Z INFO nats/server.go:31 starting internal NATS server...
2023-07-07T20:48:55.723Z INFO cmd/server.go:188 migrating database
2023-07-07T20:48:55.762Z INFO logger/logger.go:108 Configuring Core
2023-07-07T20:48:55.815Z INFO logger/logger.go:108 Modzy Core server is starting...
2023-07-07T20:48:55.818Z INFO apiserver/server.go:232 Server is starting...
2023-07-07T20:48:55.820Z INFO apiserver/server.go:274 Server is listening at :55000
✓ Reconciling Models
It can take several minutes for your model to download, but once you see "✓ Reconciling Models", you'll know that your model is running live and ready to make predictions via both REST and gRPC requests!
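The startup logs above show the server listening on port 55000. As a rough reachability check from another machine, the sketch below polls that TCP port until it accepts a connection. It is an illustration only: the helper name wait_for_port and its default values are assumptions, and an open port only shows that the API server is listening, not that the model has finished loading.

```python
import socket
import time


def wait_for_port(host: str, port: int = 55000, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False if the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Open and immediately close a connection to test reachability.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, wait_for_port("<insert-device-ip-address>") returns True as soon as the device accepts connections on port 55000, which also confirms that your machine can reach the device over the network.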
View your device on the Modzy app
Once Modzy Core is installed on your device and you have run the third command from above, you can view that device within Modzy. Return to Modzy and select your device group. Then click on the "Devices" tab. The device you just connected should appear. Click on your device.
The page that appears provides information about your device, including details you'll need in the next step: your device's IP address, and the ID and version number of the Hugging Face TinyBERT (AMD) model deployed to your device.

Device detail page
Run an inference against your model
Now that the model is up and running, you can run your first inference against it. At this point, it is accessible via REST or gRPC APIs. Follow a few simple steps to run an inference using Modzy Core's Inference API:
- Create and activate a Python environment on your local machine
- Pip install Modzy's Python SDK in your Python environment (the quotes keep your shell from treating >= as a redirect):
pip install "modzy-sdk>=11.3.0"
- Make sure you (1) know the IP address of your edge device, and (2) can access that IP address from your Python environment
- Know the model identifier and version number of the TinyBERT model deployed to your Modzy instance
- Copy the sample inference script below and save it to your local Python environment as edge-inference.py
- Replace the MODEL_ID, MODEL_VERSION, and DEVICE_IP_ADDRESS variables with your model's identifier and version in Modzy and your edge device's IP address (lines 7-9)
- Run the following command from your terminal:
python3 edge-inference.py
import json
import modzy
from modzy.edge import InputSource
from modzy.edge.proto.inferences.api.v1.inferences_pb2 import Inference

# Replace these placeholders with your own values
MODEL_ID = "<insert-model-id>"
MODEL_VERSION = "<insert-model-version>"
DEVICE_IP_ADDRESS = "<insert-device-ip-address>"

# Connect to Modzy Core on the edge device (port 55000)
client = modzy.EdgeClient(DEVICE_IP_ADDRESS, 55000)
client.connect()

# Package the input text as bytes under the input.txt key
text_bytes = "Modzy Edge is super fun!".encode()
input_obj = InputSource(
    key="input.txt",
    data=text_bytes,
)

# Submit the inference to the model running on the device
inference: Inference = client.inferences.run(MODEL_ID, MODEL_VERSION, [input_obj])

try:
    # Read the results.json output and pretty-print it
    result = inference.result
    outputs = result.outputs
    output = outputs["results.json"]
    output_bytes = output.data
    obj = json.loads(output_bytes)
    obj_formatted = json.dumps(obj, indent=4)
    print(obj_formatted)
except Exception as e:
    print(e)

client.close()
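As an alternative to editing the three variables in the script by hand, you could read them from environment variables. The sketch below shows one way to do that. The variable names (MODZY_MODEL_ID, MODZY_MODEL_VERSION, MODZY_DEVICE_IP, MODZY_EDGE_PORT) and the load_settings helper are hypothetical and not defined by Modzy.

```python
import os
from typing import Mapping, NamedTuple


class EdgeSettings(NamedTuple):
    model_id: str
    model_version: str
    device_ip: str
    port: int


# Hypothetical environment variable names chosen for this example.
REQUIRED = ("MODZY_MODEL_ID", "MODZY_MODEL_VERSION", "MODZY_DEVICE_IP")


def load_settings(env: Mapping[str, str] = os.environ) -> EdgeSettings:
    """Build the inference settings from a mapping of environment variables."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return EdgeSettings(
        model_id=env["MODZY_MODEL_ID"],
        model_version=env["MODZY_MODEL_VERSION"],
        device_ip=env["MODZY_DEVICE_IP"],
        # Port 55000 matches the port used in the script above.
        port=int(env.get("MODZY_EDGE_PORT", "55000")),
    )
```

With this in place, the script could call settings = load_settings() and pass settings.device_ip and settings.port to modzy.EdgeClient, which keeps device-specific values out of the file.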
Stop model
Now that you have successfully deployed, spun up, and run inference against your model on your edge device, you can stop this model at any time by killing the Modzy Core process you started with the ./modzy-core server command.
To do so, navigate to that terminal and press Ctrl + C on your keyboard. This should cleanly stop all running Modzy Core processes, which in turn removes all containers. Run the following command to verify your model container(s) have been removed.
docker ps -a
If this command results in an empty list of containers, you are good to go! If not, clean up any remaining containers by copying the ID of the container(s) and typing the following in your terminal.
docker rm -f <insert-container-id-1> <insert-container-id-2> <insert-container-id-...>
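If several containers are left over, copying IDs by hand gets tedious. The sketch below builds the docker rm -f command from the output of docker ps -a -q, which prints one container ID per line. The helper names are illustrative, and the sketch only constructs the command so you can review it first, since docker ps -a also lists containers unrelated to Modzy.

```python
from typing import List


def parse_container_ids(ps_output: str) -> List[str]:
    """Extract container IDs from 'docker ps -a -q' output (one ID per line)."""
    return [line.strip() for line in ps_output.splitlines() if line.strip()]


def build_rm_command(container_ids: List[str]) -> List[str]:
    """Return the 'docker rm -f' argument list, or an empty list if nothing is left to remove."""
    if not container_ids:
        return []
    return ["docker", "rm", "-f"] + list(container_ids)
```

To use it, capture the output of docker ps -a -q (for example with subprocess.run), pass it to parse_container_ids, and inspect the resulting command before running it.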
Restart model
Whenever you are ready to spin up your model again and submit more inferences, you can resume the process with the following command:
./modzy-core server --resume