Container specifications: REST
Modzy requests the model’s status, runs data through the model, and retrieves output results from a model container. When needed, it can request that the container shut down. The container specification is an API that responds to these requests. Here we provide a sample specification that uses REST over HTTP/1.1 with JSON to send data, for your reference.
We currently support Docker containers. Support for all OCI-compliant containers is coming soon.
Requirements
Container
The Docker container must expose an HTTP API on the port specified by the PSC_MODEL_PORT environment variable that implements the /status, /run, and /shutdown routes detailed below.
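The requirements above can be sketched with Python’s standard library alone. The handler below is illustrative (message strings and the fallback port are assumptions, not part of the spec), but it wires PSC_MODEL_PORT to the three required routes:

```python
import json
import os
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class ModelHandler(BaseHTTPRequestHandler):
    """Minimal handler exposing the /status, /run, and /shutdown routes."""

    def _reply(self, code, payload):
        # Every route answers with the application/json response DTO.
        body = json.dumps(payload).encode()
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        if self.path == "/status":
            # A real model would load its weights here and report failures as 500.
            self._reply(200, {"message": "Model loaded.", "status": "OK", "statusCode": 200})
        else:
            self._reply(404, {"message": "Unknown route.", "status": "Error", "statusCode": 404})

    def do_POST(self):
        if self.path == "/run":
            length = int(self.headers.get("Content-Length", 0))
            job = json.loads(self.rfile.read(length) or b"{}")
            # Inference would read from job["input"] and write to job["output"] here.
            self._reply(200, {"message": "Successful inference.", "status": "OK", "statusCode": 200})
        elif self.path == "/shutdown":
            self._reply(202, {"message": "Shutting down.", "status": "OK", "statusCode": 202})
            # shutdown() must run on another thread, or serve_forever() deadlocks.
            threading.Thread(target=self.server.shutdown).start()
        else:
            self._reply(404, {"message": "Unknown route.", "status": "Error", "statusCode": 404})

def serve():
    """Entry point: bind to PSC_MODEL_PORT (falling back to 8080) and serve."""
    port = int(os.environ.get("PSC_MODEL_PORT", "8080"))
    HTTPServer(("", port), ModelHandler).serve_forever()
```

The entrypoint script described next would simply exec a server like this so it receives signals directly inside the container.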
Entry Point
The container entry point is responsible for launching the HTTP server, e.g. when run as:
docker run image
Use the exec syntax in your Dockerfile to define the entry point:
COPY entrypoint.sh ./
ENTRYPOINT ["./entrypoint.sh"]
Inputs and outputs
Inputs hold the input-items sent to the model to be processed. Outputs hold the result files returned by the model. Specify the input and output item names to link the model to input-items and results.
Each model defines a filename for the input and output items. The filenames of the input-items sent must match the model’s input-item names.
For example, the Multi-language OCR model defines its inputs and outputs as follows:
- the input-item has two data-items: image and config.json,
- the output is named results.json.
In this case, the path to the input directory contains:
ls /path/to/input/directory
image config.json
And when the results are available, the output directory contains:
ls /path/to/output/directory
results.json
Check out our run a model tutorial for more details.
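Following the directory layout above, a model’s run step just reads its declared input-items and writes its declared result file. A minimal Python sketch, where the inference itself is a hypothetical placeholder:

```python
import json
import os

def run_job(input_dir: str, output_dir: str) -> None:
    """Read the expected input-items and write the named result file."""
    # Input-item names must match what the model declares (here: image, config.json).
    with open(os.path.join(input_dir, "config.json")) as f:
        config = json.load(f)
    with open(os.path.join(input_dir, "image"), "rb") as f:
        image_bytes = f.read()

    # Hypothetical inference step; replace with the model's real logic.
    result = {"textDetected": len(image_bytes) > 0, "language": config.get("language")}

    # The output filename must match the model's declared output-item name.
    os.makedirs(output_dir, exist_ok=True)
    with open(os.path.join(output_dir, "results.json"), "w") as f:
        json.dump(result, f)
```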
HTTP API specifications
The response DTO
All routes should respond with an application/json content type and in this format:
{
"message": "The call went well or terribly.",
"status": "OK",
"statusCode": 200
}
Ensure the message provides useful feedback about model errors.
GET /status
Initializes the model and returns its status.
Response
{
"message": "The call went well.",
"status": "Ok",
"statusCode": 200
}
| Status code | Description |
|---|---|
| 200 | The model is ready to run. |
| 500 | Unexpected error loading the model. |
Batch processing
To set up a model for batch processing, the /status route should return a batch processing size. A model’s batch_size is the maximum number of input items it can process simultaneously while mounted on a GPU.
{
"batch_size": 64,
"message": "Model Ready to receive 64 inputs in run route.",
"status": "Ok",
"statusCode": 200
}
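As a sketch, a batch-capable /status handler only needs to add batch_size to the standard response DTO; the helper below mirrors the sample payload (its message wording is illustrative):

```python
def status_response(batch_size: int) -> dict:
    """Build the /status response body for a batch-capable model."""
    return {
        "batch_size": batch_size,
        "message": f"Model ready to receive {batch_size} inputs in run route.",
        "status": "OK",
        "statusCode": 200,
    }
```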
POST /run
Runs the model inference on a given input.
Request
Add the job configuration object (use application/json):
{
"type": "file",
"explain": true,
"input": "/path/to/input/directory",
"output": "/path/to/output/directory"
}
| Parameter | Description |
|---|---|
| type | The input’s and output’s type. It can be file or batch. |
| explain | Sets the explainability feature when a model offers the option. |
| input | The filesystem directory path from which the model should read input-item files. |
| output | The filesystem directory path where the model should write output data files. |
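A model server can reject a malformed job configuration before running inference; a 400 response would map to a failed check here. This validator is a hedged sketch (field names come from the table above, the rules are assumptions):

```python
def validate_job_config(job: dict) -> list:
    """Return a list of problems; an empty list means the config is acceptable."""
    problems = []
    if job.get("type") != "file":
        problems.append("type must be 'file' for a single-input run")
    for key in ("input", "output"):
        if not isinstance(job.get(key), str) or not job[key]:
            problems.append(f"{key} must be a non-empty directory path")
    if "explain" in job and not isinstance(job["explain"], bool):
        problems.append("explain must be a boolean")
    return problems
```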
Batch processing
If the model supports batch processing, the run call requires "type": "batch" and replaces input with inputs and output with outputs:
{
"type": "batch",
"inputs": ["/path/to/input/directory", "/path/to/other/input/directory"],
"outputs": ["/path/to/output/directory", "/path/to/other/output/directory"]
}
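The batch form pluralizes the directory fields, with input and output paths paired by position. A hedged helper for building that payload (the pairwise-length check is an assumption, though the sample above keeps the lists aligned):

```python
def batch_job_config(input_dirs, output_dirs):
    """Build a batch /run payload; input and output lists must align pairwise."""
    if len(input_dirs) != len(output_dirs):
        raise ValueError("each input directory needs a matching output directory")
    return {
        "type": "batch",
        "inputs": list(input_dirs),
        "outputs": list(output_dirs),
    }
```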
Response
{
"message": "Successful inference.",
"status": "OK",
"statusCode": 200
}
| Status code | Description |
|---|---|
| 200 | Successful inference. |
| 400 | Invalid job configuration object. |
| 415 | Invalid media type. |
| 422 | Unprocessable input file. |
| 500 | Unexpected error running the model. |
If some inputs fail while others succeed, the response can include an errors array that maps each failed input’s zero-based index to an error message:
{
"errors": [
{
"1": "Error in second input"
}
],
"message": "Success with errors.",
"status": "OK",
"statusCode": 200
}
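A caller consuming a batch response can pick the failed inputs out of that errors array; this helper is a sketch whose shape is assumed from the sample above:

```python
def failed_inputs(response: dict) -> dict:
    """Map zero-based input indices to error messages from a /run response."""
    failures = {}
    for entry in response.get("errors", []):
        # Each entry keys a stringified input index to its error message.
        for index, message in entry.items():
            failures[int(index)] = message
    return failures
```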
Explainability
Output files contain inference results. Models with built-in explainability return output files with this structure:
For image classification models:
{
"modelType": "",
"result": {
"classPredictions": []
},
"explanation": {
"maskRLE": []
}
}
For text classification models:
{
"modelType": "textClassification",
"result": {
"classPredictions": []
},
"explanation": {
"wordImportances": {},
"explainableText": {}
}
}
Image classification models explainability object

| Parameter | Type | Description |
|---|---|---|
| modelType | string | Defines the explanation format. |
| result | object | Contains the results in a classPredictions array. |
| explanation | object | Contains a maskRLE array with the explanation and a dimensions object with the height and width in pixels. |
Text classification models explainability object

| Parameter | Type | Description |
|---|---|---|
| modelType | string | Defines the explanation format. Possible options include textClassification. |
| result | object | Contains the results in a classPredictions array. |
| explanation | object | Contains a wordImportances object and an optional explainableText object. |
POST /shutdown
Shuts the model server down. The model server process should exit with exit code 0.
Response
The model server is not required to send a response and may simply drop the connection; however, a response is encouraged.
Status codes | |
---|---|
Status 202 | Request accepted. |
Status 500 | Unexpected error shutting down the model. |
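One way to honor both the optional 202 and the exit-code requirement is to acknowledge the request, then terminate the process shortly afterwards so the response has a chance to flush. A sketch with assumed names (send_response is a stand-in for the server’s reply mechanism; exit_fn is injectable so the behavior can be observed without killing the process):

```python
import os
import threading

def handle_shutdown(send_response, exit_fn=lambda: os._exit(0)) -> None:
    """Acknowledge the shutdown request, then exit the process with code 0."""
    send_response(202, {"message": "Shutting down.", "status": "OK", "statusCode": 202})
    # A short delay lets the (optional) 202 response flush before the process dies.
    threading.Timer(0.5, exit_fn).start()
```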