Modzy v1.6 introduces significant performance improvements to Modzy Edge, making it possible to run powerful models quickly and efficiently on small single-board computers such as the Raspberry Pi or NVIDIA Jetson Nano.
New inferenceAPI added to Modzy Edge:
- 15x faster inference times thanks to a new caching layer and the removal of expensive data writes (initial benchmarks show 300 inferences/sec on a TinyBERT model)
- Provides both HTTP/REST and gRPC interfaces
- Allows for custom tags to be included with each inference
- Fully supports bi-directional streaming
- New Python library for interacting with the inferenceAPI via gRPC
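As a rough illustration of what submitting a job with custom tags to the inferenceAPI might look like, the sketch below builds a request payload and shows where it would be posted over HTTP. The field names, endpoint path, and port are hypothetical, not the documented schema.

```python
import json

def build_inference_request(model_id, model_version, text, tags=None):
    """Assemble a JSON payload for an edge inference request.

    All field names here are illustrative placeholders, not
    Modzy's actual request schema.
    """
    payload = {
        "model": {"identifier": model_id, "version": model_version},
        "input": {"type": "text", "data": text},
    }
    if tags:
        # Custom tags can be attached to each inference for later filtering
        payload["tags"] = tags
    return payload

payload = build_inference_request(
    "tinybert", "1.0.0", "hello world", tags={"source": "sensor-7"}
)

# To submit, POST the payload to the edge device's REST interface, e.g.:
#   requests.post("http://localhost:55000/api/v1/inferences", json=payload)
body = json.dumps(payload)
```

The same request could equally be sent over the gRPC interface; REST is shown here only because it needs no generated client stubs.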
- New direct mode added to Modzy Core provides even higher inference speeds by eliminating local storage and returning results only through gRPC messages
- Added support to Modzy Core for running inferences against data stored in AWS S3, Azure Blob Storage, NetApp StorageGRID, and other S3-compliant storage providers
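To make the remote-storage feature concrete, here is a sketch of how an input stored in an S3-compatible bucket might be described in a request, rather than inlining the data. The structure and field names are assumptions for illustration, not Modzy Core's actual schema.

```python
def build_s3_input(bucket, key, region="us-east-1"):
    """Describe an S3-hosted input by reference (illustrative field names).

    Any S3-compliant store (AWS S3, NetApp StorageGRID, ...) could be
    addressed the same way; a non-AWS provider would additionally need
    an endpoint URL pointing at that provider.
    """
    return {
        "type": "aws-s3",
        "bucket": bucket,
        "key": key,
        "region": region,
    }

# The edge device fetches the object itself, so large inputs never
# have to travel through the client that submits the job.
s3_input = build_s3_input("edge-data", "frames/cam1/0001.jpg")
```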
- Improved CLI experience for Modzy Core, including more informative error messages
- Multiple microservices have been rewritten in Golang, reducing Modzy's installation footprint and resource consumption
- Fixed a bug that prevented hardware stats from edge devices, such as number of cores and RAM, from being captured correctly
- Modzy Core's queue now persists to the local edge filesystem instead of living in memory, which improves reliability if Modzy Core shuts down unexpectedly
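The durability idea behind the last item can be sketched with a minimal file-backed queue: each accepted item is flushed to disk before being acknowledged, so a fresh process pointed at the same file can recover pending work after a crash. This is a generic illustration of the pattern, not Modzy's implementation.

```python
import json
import os
import tempfile

class FileBackedQueue:
    """Minimal file-backed FIFO queue (illustrative, not Modzy's code).

    Items are appended to a JSON-lines file and fsync'd, so work
    accepted before a crash survives a restart.
    """

    def __init__(self, path):
        self.path = path
        if not os.path.exists(path):
            open(path, "w").close()

    def enqueue(self, item):
        with open(self.path, "a") as f:
            f.write(json.dumps(item) + "\n")
            f.flush()
            os.fsync(f.fileno())  # force the write to durable storage

    def drain(self):
        """Return all pending items and truncate the backing file."""
        with open(self.path) as f:
            items = [json.loads(line) for line in f if line.strip()]
        open(self.path, "w").close()
        return items

path = os.path.join(tempfile.mkdtemp(), "queue.jsonl")
q = FileBackedQueue(path)
q.enqueue({"job": 1})
q.enqueue({"job": 2})

# A fresh instance pointed at the same file recovers the pending work,
# which an in-memory queue would have lost on shutdown:
recovered = FileBackedQueue(path).drain()
```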