In this tutorial, we will continue our end-to-end Modzy journey by sending the model we've been working with to an edge device, running that model at the edge, and interacting with it using Modzy's Python SDK and Inference API.
What you'll need for this tutorial
- The same model we've been using up until this point
- x86 edge device with Linux and Docker installed (we tested this tutorial on an UP Board, but check out our minimum requirements to see if your device will run Modzy Core)
- A valid Modzy account
- Your local Python environment with Modzy's SDK installed (Python >= 3.6)
The first thing we will do is verify that your device is properly set up and ready to host your model, so we can head off any avoidable errors.
Log into your device (reminder: it must be running a Linux OS) via SSH or your preferred method to follow the verification instructions below.
If you have not already installed Docker on your device, follow the appropriate instructions. Once installed, verify Docker is configured correctly by typing the following in your terminal:
$ docker run hello-world
If your output looks similar to the following, you are good to go!
Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (arm64v8)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/
A common error you might experience is "permission denied." If this happens, you likely have not configured Docker to be managed as a non-root user (i.e., you can only run Docker with sudo). Follow these post-installation instructions to do so.
Double-checking that Docker can run as a non-root user will save you many headaches when it comes time to run Modzy Core.
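If you'd rather script this check, here's a minimal sketch (our own helper, not part of Modzy's tooling) that verifies the docker CLI exists and runs without sudo for the current user:

```python
# Minimal check that Docker is installed and usable without sudo.
# This helper is illustrative, not part of Modzy's tooling.
import shutil
import subprocess

def docker_usable() -> bool:
    """True if the docker CLI exists and `docker info` succeeds as this user."""
    if shutil.which("docker") is None:
        return False
    proc = subprocess.run(["docker", "info"], capture_output=True)
    return proc.returncode == 0

print("Docker OK" if docker_usable() else "Docker missing or needs sudo")
```

A `docker info` failure here usually means exactly the permission problem described above, and the post-installation instructions are the fix.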
In your edge device terminal, type the following command:
$ lscpu
Your output should look similar to this:
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   36 bits physical, 48 bits virtual
CPU(s):                          4
On-line CPU(s) list:             0-3
Thread(s) per core:              1
Core(s) per socket:              4
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           76
Model name:                      Intel(R) Atom(TM) x5-Z8350 CPU @ 1.44GHz
Stepping:                        4
CPU MHz:                         479.997
CPU max MHz:                     1920.0000
CPU min MHz:                     480.0000
BogoMIPS:                        2880.00
Virtualization:                  VT-x
L1d cache:                       96 KiB
L1i cache:                       128 KiB
L2 cache:                        2 MiB
NUMA node0 CPU(s):               0-3
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Mitigation; Clear CPU buffers; SMT disabled
Vulnerability Meltdown:          Mitigation; PTI
Vulnerability Mmio stale data:   Not affected
Vulnerability Spec store bypass: Not affected
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology tsc_reliable nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes rdrand lahf_lm 3dnowprefetch epb pti ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid tsc_adjust smep erms dtherm ida arat md_clear
The line we are interested in is the Architecture line at the top. Verify that it reads x86_64. If so, you're good to go!
ARM Chip Caveat
This tutorial assumes your device has an x86/AMD chipset. If your device architecture is a variant of the ARM chipset family, you can still deploy a model and Modzy Core to the device, but you will need to refer to our tutorial for ARM devices.
Common ARM variants you might see:
aarch64, arm64v8, arm32v7, arm32v6, armv6l
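To check programmatically which family your device falls into, here's a small sketch; the `chipset_family` helper is ours and purely illustrative:

```python
# Classify the machine string (what `uname -m` / platform.machine() reports)
# so you know whether to follow this tutorial (x86) or the ARM tutorial.
import platform

ARM_VARIANTS = {"aarch64", "arm64v8", "arm32v7", "arm32v6", "armv6l"}

def chipset_family(machine: str) -> str:
    """Return 'x86', 'arm', or 'unknown' for a machine-architecture string."""
    if machine in ("x86_64", "amd64"):
        return "x86"
    if machine in ARM_VARIANTS or machine.startswith("arm"):
        return "arm"
    return "unknown"

print(chipset_family(platform.machine()))
```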
As we proceed with this tutorial, stay logged into your edge device!
To deploy your model to an edge device, you must first create a Device Group. Back in your Modzy instance, navigate to the "Operations" tab and select the "Edge Devices" page. From here, click the "New Device Group" button in the top right corner of the screen.
Next, you'll need to provide the following information:
- Fill out the Device group name field, calling it something like "Edge Deployment Tutorial Group"
- Click on the Add model button and select the Hugging Face TinyBERT (AMD) model we have been working with up until this point (refer to the Deploy Model tutorial if you have not yet deployed this model)
- Use the default values for all other fields.
With this configuration set, click "Create New Device Group" to continue, which will take you back to the Edge Devices page.
Now, you will need to create a device token, which will be used to securely register your edge device with Modzy. Select your newly created device group and click the "Create Device Token" button found in the top right corner of the screen.
You will next be prompted to set two fields:
- Set the Expiration time to 1 hour if you plan to complete this tutorial immediately. (After 1 hour, this token will expire and you'll need to create a new token.)
- Set the Number of uses to 10. The token stops working after it has been used this many times, so setting it to 10 leaves room for debugging without the token running out too quickly.
Once you've filled in these values, click the "Create Device Token" button.
Next, you'll generate the commands needed to install and run Modzy core on your edge device.
With the "Using wget" tab selected, set "Device type" to "Linux with 64-bit Intel/AMD chip".
Then click "Generate Commands". The following will show up on your screen:
Copy these three commands into a notepad or text editor so you can run them in a terminal on your edge device.
Act quickly! wget URLs are time-sensitive
The wget download URL expires after 10 minutes, so make sure to move onto the next steps right away.
Switch over to your edge device, navigate to the folder where you'd like to install Modzy Core and in your terminal, run the first command you copied in the previous step. This command will look something like this:
wget 'https://<yourmodzydomain.app.modzy.com>/api/downloads/modzy-core-linux-amd64?token=s.bqCRPEc1BpAIVdBDxKzrz0zF' -O modzy-core
In your edge device terminal, you should see the binary begin to download. After the download completes, run the next command:
chmod +x modzy-core
Finally, we will slightly modify the third command before running it by increasing the model timeout (add --model.timeout 30). This is required because this model takes longer than the default 10 seconds to load with Modzy Core.
Your command will look something like this:
./modzy-core server --model.timeout 30 --modzy.url https://<yourmodzydomain.app.modzy.com> DEVICE_REGISTRATION_TOKEN
In the example above, make sure to replace DEVICE_REGISTRATION_TOKEN with a valid, unexpired device token.
After running these commands you should see logs beginning to populate your terminal. These logs will let you see the progress of your model container being downloaded to the device. Once the process is complete, you should see a set of logs that look similar to the following:
2023-07-07T20:48:54.784Z INFO logger/logger.go:108 acquiring lock
2023-07-07T20:48:54.786Z INFO logger/logger.go:108 connecting to Modzy...
2023-07-07T20:48:54.787Z INFO logger/logger.go:108 device token provided, proceeding with registration via http
2023-07-07T20:48:55.496Z INFO nats/server.go:31 starting internal NATS server...
2023-07-07T20:48:55.723Z INFO cmd/server.go:188 migrating database
2023-07-07T20:48:55.762Z INFO logger/logger.go:108 Configuring Core
2023-07-07T20:48:55.815Z INFO logger/logger.go:108 Modzy Core server is starting...
2023-07-07T20:48:55.818Z INFO apiserver/server.go:232 Server is starting...
2023-07-07T20:48:55.820Z INFO apiserver/server.go:274 Server is listening at :55000
✓ Reconciling Models
It can take several minutes for your model to download, but once you see "✓ Reconciling Models", you'll know that your model is running live and ready to make predictions via both REST and gRPC requests!
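If you are driving this setup from a script rather than watching the terminal, you can watch for that readiness line yourself. A hedged sketch (the `model_ready` helper is ours, not part of Modzy's tooling):

```python
# Detect the "Reconciling Models" line in Modzy Core's log output, which
# signals the model container is downloaded and live. Illustrative helper only.
def model_ready(log_lines) -> bool:
    """True if any log line contains the readiness marker."""
    return any("Reconciling Models" in line for line in log_lines)

sample = [
    "2023-07-07T20:48:55.818Z INFO apiserver/server.go:232 Server is starting...",
    "2023-07-07T20:48:55.820Z INFO apiserver/server.go:274 Server is listening at :55000",
    "✓ Reconciling Models",
]
print(model_ready(sample))  # True
```

In practice you would feed this the lines you read from the modzy-core process's stdout.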
Once Modzy Core is installed on your device and you have run the third command from above, you can view that device within Modzy. Return to Modzy and select your device group, then click on the "Devices" tab. The device you just connected should appear; click on it.
The page that appears provides details about your device, including information you'll need in the next step: your device's IP address, and the ID and version number of the Hugging Face TinyBERT (AMD) model deployed to your device.
Now that the model is up and running, you can run your first inference against it. At this point, the model is accessible via REST or gRPC APIs. Follow these steps to run an inference using Modzy Core's Inference API:
- Create and activate a Python environment on your local machine
- Pip install Modzy's Python SDK in your Python environment (quote the requirement so your shell doesn't treat >= as a redirect):
pip install "modzy-sdk>=11.3.0"
- Make sure you (1) know the IP address of your edge device, and (2) can access that IP address from your Python environment
- Know the model identifier and version number of the TinyBERT model deployed to your Modzy instance.
- Copy the sample inference script below and save it to your local Python environment as a Python file
- Replace the MODEL_ID, MODEL_VERSION, and DEVICE_IP_ADDRESS variables with your model's identifier and version in Modzy and your edge device's IP address (lines 7-9)
- Run the script from your terminal
import json

import modzy
from modzy.edge import InputSource
from modzy.edge.proto.inferences.api.v1.inferences_pb2 import Inference

MODEL_ID = "<insert-model-id>"
MODEL_VERSION = "<insert-model-version>"
DEVICE_IP_ADDRESS = "<insert-device-ip-address>"

client = modzy.EdgeClient(DEVICE_IP_ADDRESS, 55000)
client.connect()

text_bytes = "Modzy Edge is super fun!".encode()
input_obj = InputSource(
    key="input.txt",
    data=text_bytes,
)

inference: Inference = client.inferences.run(MODEL_ID, MODEL_VERSION, [input_obj])

try:
    result = inference.result
    outputs = result.outputs
    output = outputs["results.json"]
    output_bytes = output.data
    obj = json.loads(output_bytes)
    obj_formatted = json.dumps(obj, indent=4)
    print(obj_formatted)
except Exception as e:
    print(e)

client.close()
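If you want to pull a single prediction out of the printed JSON instead of dumping the whole document, here's a hedged sketch. The `classPredictions` schema below is an assumption about what this model's results.json looks like, not something this tutorial confirms, so adjust the keys to whatever your model actually returns:

```python
# Extract the top-scoring class from a results.json payload.
# NOTE: the schema used in sample_output is assumed, not guaranteed.
import json

sample_output = json.dumps({
    "data": {"result": {"classPredictions": [
        {"class": "positive", "score": 0.98},
        {"class": "negative", "score": 0.02},
    ]}}
})

def top_prediction(results_json: str) -> tuple:
    """Return (class, score) for the highest-scoring prediction."""
    preds = json.loads(results_json)["data"]["result"]["classPredictions"]
    best = max(preds, key=lambda p: p["score"])
    return best["class"], best["score"]

print(top_prediction(sample_output))  # ('positive', 0.98)
```

In the script above, you would call `top_prediction(output_bytes)` in place of the `json.dumps` pretty-printing step.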
Now that you have successfully deployed, spun up, and run inference against your model on your edge device, you can stop the model at any time by killing the Modzy Core process you started with the ./modzy-core server command.
To do so, navigate to that terminal and press Ctrl + C on your keyboard. This should cleanly kill all Modzy Core processes, which in turn removes all containers. Run the following command to verify your model container(s) have been removed:
docker ps -a
If this command returns an empty list of containers, you are good to go! If not, clean up any remaining containers by copying the ID of the container(s) and typing the following in your terminal:
docker rm -f <insert-container-id-1> <insert-container-id-2> <insert-container-id-...>
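This cleanup check can also be scripted. A quick sketch (assumes the docker CLI is on your PATH; it only reports leftovers rather than removing them, so there are no surprises):

```python
# Report any containers (running or stopped) left behind after stopping
# Modzy Core. Illustrative helper, not part of Modzy's tooling.
import shutil
import subprocess

def leftover_container_ids() -> list:
    """Return IDs of all containers, or [] if the docker CLI is absent."""
    if shutil.which("docker") is None:
        return []
    out = subprocess.run(["docker", "ps", "-aq"], capture_output=True, text=True)
    return out.stdout.split()

ids = leftover_container_ids()
print("all clean" if not ids else "leftover containers: " + " ".join(ids))
```

If it reports leftovers, run the docker rm -f command above with those IDs.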
And whenever you are ready to spin up your model and submit more inferences, you can simply resume the process with the following command:
./modzy-core server --resume