[Deprecated] Modzy v1 Container Spec
Modzy requests the model’s status, runs data through the model, and retrieves output results from a model container. When needed, it can request that the container shut down. The container specification is an API that responds to these requests. Here we provide a sample specification, for your reference, that uses REST over HTTP/1.1 with JSON payloads.
We currently support Docker containers. Support for all OCI-compliant containers is coming soon.
Requirements
Container
The Docker container must expose an HTTP API on the port specified by the `PSC_MODEL_PORT` environment variable that implements the `/status`, `/run`, and `/shutdown` routes detailed below.
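Given these requirements, a minimal model server can be sketched with the Python standard library alone. The route names, response format, and `PSC_MODEL_PORT` variable come from this spec; the handler logic and helper names are illustrative, not a reference implementation.

```python
# Minimal sketch of a v1-spec model server using only the standard library.
import json
import os
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class ModelHandler(BaseHTTPRequestHandler):
    def _reply(self, status_code, message, status):
        # Every route returns the same application/json DTO shape.
        body = json.dumps(
            {"message": message, "status": status, "statusCode": status_code}
        ).encode()
        self.send_response(status_code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        if self.path == "/status":
            self._reply(200, "The model is ready to run.", "OK")
        else:
            self._reply(404, "Unknown route.", "Not Found")

    def do_POST(self):
        if self.path == "/run":
            length = int(self.headers.get("Content-Length", 0))
            job = json.loads(self.rfile.read(length) or b"{}")
            # A real model would read from job["input"] and write to
            # job["output"]; here we only acknowledge the request.
            self._reply(200, "Successful inference.", "OK")
        elif self.path == "/shutdown":
            self._reply(202, "Shutting down.", "OK")
            # Stop the server loop after the response is flushed, so the
            # process can exit with code 0 as the spec requires.
            threading.Thread(target=self.server.shutdown, daemon=True).start()
        else:
            self._reply(404, "Unknown route.", "Not Found")

    def log_message(self, fmt, *args):
        # Silence the default per-request stderr logging.
        pass


def serve():
    # The spec requires listening on PSC_MODEL_PORT; the 45000 fallback
    # here is only a convenience for local testing.
    port = int(os.environ.get("PSC_MODEL_PORT", 45000))
    HTTPServer(("", port), ModelHandler).serve_forever()
```

The container entry point would call `serve()`; when `/shutdown` stops the loop, `serve()` returns and the process exits normally.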
Entry Point
The container entry point is responsible for launching the HTTP server, e.g. when run as:
docker run image
Use the exec syntax in your Dockerfile to define the entry point:
COPY entrypoint.sh ./
ENTRYPOINT ["./entrypoint.sh"]
Inputs and outputs
Inputs hold the input-items sent to the model for processing. Outputs hold the result files the model returns. Specify the input and output item names to link the model to input-items and results.
Each model defines filenames for its input and output items. The filenames of the submitted input-items must match the model’s input-item names.
For example, the Multi-language OCR model defines its inputs and outputs as follows:
- the input-item has two data-items: `image` and `config.json`,
- the output is named `results.json`.
In this case, the path to the input directory contains:
ls /path/to/input/directory
image config.json
And when the results are available, the output directory contains:
ls /path/to/output/directory
results.json
Check out our run a model tutorial for more details.
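The directory layout above can be sketched from the model’s side. This hypothetical `run` helper uses the input-item and output-item names from the OCR example; `fake_ocr` is a placeholder for real inference.

```python
# Sketch: how a model's run step might consume the input directory and
# produce the output directory, using the OCR example's item names.
import json
import os


def fake_ocr(image_bytes, config):
    # Placeholder inference; a real model would run OCR here.
    return {"text": "", "language": config.get("language", "en")}


def run(input_dir, output_dir):
    # Input-item names must match what the model declares: image, config.json.
    with open(os.path.join(input_dir, "image"), "rb") as f:
        image_bytes = f.read()
    with open(os.path.join(input_dir, "config.json")) as f:
        config = json.load(f)

    result = fake_ocr(image_bytes, config)

    # The output filename must match the declared results.json.
    os.makedirs(output_dir, exist_ok=True)
    with open(os.path.join(output_dir, "results.json"), "w") as f:
        json.dump(result, f)
```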
HTTP API specifications
The response DTO
All routes should respond with the `application/json` content type and this format:
{
"message": "The call went well or terribly.",
"status": "OK",
"statusCode": 200
}
Ensure the message provides useful feedback about model errors.
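One way to keep messages useful is to build the DTO in a single helper and surface the underlying error text on failure. The `make_dto` and `safe_load` names here are illustrative, not part of the spec.

```python
# Sketch: build the response DTO consistently and put real error detail
# into the message field instead of a generic failure string.
def make_dto(status_code, message, status):
    return {"message": message, "status": status, "statusCode": status_code}


def safe_load(load_model):
    """Wrap model loading so failures produce a descriptive 500 DTO."""
    try:
        load_model()
        return make_dto(200, "The model is ready to run.", "OK")
    except Exception as exc:
        # Include the underlying error so the caller can actually debug it.
        return make_dto(500, f"Failed to load model: {exc}", "Internal Server Error")
```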
GET /status
Initializes the model and returns its status.
Response
{
"message": "The call went well.",
"status": "OK",
"statusCode": 200
}
Status codes

| Status code | Description |
|---|---|
| 200 | The model is ready to run. |
| 500 | Unexpected error loading the model. |
Batch processing
To enable batch processing, the `/status` route should return a batch processing size. A model’s `batch_size` is the maximum number of input items it can process simultaneously while mounted on a GPU.
{
"batch_size": 64,
"message": "Model Ready to receive 64 inputs in run route.",
"status": "OK",
"statusCode": 200
}
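A caller can use the advertised size to group inputs into run-sized batches. Only the `batch_size` field comes from the `/status` response above; the helper names are illustrative.

```python
# Sketch: honor the batch size a model advertises in its /status response.
def batch_size_from_status(status_body):
    # Models that do not advertise batch_size take one input per run call.
    return int(status_body.get("batch_size", 1))


def chunk(items, batch_size):
    """Split items into groups no larger than the model's batch size."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
```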
POST /run
Runs the model inference on a given input.
Request
Add the job configuration object (use `application/json`):
{
"type": "file",
"explain": true,
"input": "/path/to/input/directory",
"output": "/path/to/output/directory"
}
Parameters

| Parameter | Description |
|---|---|
| `type` | The input’s and output’s type. It can be `file` or `batch`. |
| `explain` (optional) | Sets the explainability feature when a model offers the option. |
| `input` | The filesystem directory path from which the model should read input-item files. |
| `output` | The filesystem directory path where the model should write output data files. |
Batch processing
If the model supports batch processing, the run call requires `"type": "batch"` and replaces `input` with `inputs` and `output` with `outputs`:
{
"type": "batch",
"inputs": ["/path/to/input/directory", "/path/to/other/input/directory"],
"outputs": ["/path/to/output/directory", "/path/to/other/output/directory"]
}
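A client issuing this call can be sketched with `urllib`. The route, media type, and body shape come from this spec; the `post_run` helper and the host/port it is given are illustrative.

```python
# Sketch: POST a job configuration object to the /run route.
import json
import urllib.request


def post_run(base_url, job):
    """Send the job configuration to /run and decode the response DTO."""
    req = urllib.request.Request(
        base_url + "/run",
        data=json.dumps(job).encode(),
        headers={"Content-Type": "application/json"},  # avoids a 415 response
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

For a batch run, pass the object shown above, e.g. `post_run("http://localhost:45000", {"type": "batch", "inputs": [...], "outputs": [...]})` with real directory paths.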
Response
{
"message": "Successful inference.",
"status": "OK",
"statusCode": 200
}
Status codes

| Status code | Description |
|---|---|
| 200 | Successful inference. |
| 400 | Invalid job configuration object. The job configuration object is malformed, or the expected files do not exist or cannot be read or written. This should never occur when running on the platform but may be useful for debugging. |
| 415 | Invalid media type. The client did not post `application/json` in the HTTP body. This should never occur when running on the platform but may be useful for debugging. |
| 422 | Unprocessable input file. The model cannot run inference on the provided input files (for example, an input file may be the wrong format, too large, or too small). The response message should contain a detailed validation error that explains why the model cannot process a given input file. |
| 500 | Unexpected error running the model. |
When a batch run partially succeeds, the response lists the failed inputs in an `errors` array, keyed by input index:
{
"errors": [
{
"1": "Error in second input"
}
],
"message": "Success with errors.",
"status": "OK",
"statusCode": 200
}
Explainability
Output files contain inference results. Models with built-in explainability return output files with this structure:
{
"modelType": "",
"result": {
"classPredictions": []
},
"explanation": {
"maskRLE": []
}
}
{
"modelType": "textClassification",
"result": {
"classPredictions": []
},
"explanation": {
"wordImportances": {},
"explainableText": {}
}
}
Image classification models explainability object
| Parameter | Type | Description |
|---|---|---|
| `modelType` | string | Defines the explanation format. Possible options: `imageClassification`, `imageSegmentation`, `objectDetection`. |
| `result` | object | Contains the results in a `classPredictions` array. |
| `explanation` | object | Contains a `maskRLE` array with the explanation and a `dimensions` object with the height and width in pixels. The `maskRLE` follows a column-major order (Fortran order). |
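The column-major ordering can be made concrete with a small decoder. The spec above does not define the exact run-length encoding, so this sketch assumes runs alternate between 0 and 1 starting with 0; treat both the assumption and the `decode_mask_rle` name as illustrative.

```python
# Sketch: expand a maskRLE into a height x width mask, filling values in
# column-major (Fortran) order as the spec requires. Assumes the RLE is a
# list of run lengths alternating 0s and 1s, starting with 0.
def decode_mask_rle(mask_rle, height, width):
    flat = []
    value = 0
    for run in mask_rle:
        flat.extend([value] * run)
        value = 1 - value

    # Column-major: consecutive flat values fill a column top to bottom.
    mask = [[0] * width for _ in range(height)]
    for idx, v in enumerate(flat):
        col, row = divmod(idx, height)
        mask[row][col] = v
    return mask
```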
Text classification models explainability object
| Parameter | Type | Description |
|---|---|---|
| `modelType` | string | Defines the explanation format. Possible option: `textClassification`. |
| `result` | object | Contains the results in a `classPredictions` list that consists of a prediction and score for each class. |
| `explanation` | object | Contains a `wordImportances` key/value pair: for each class, a list with the `word`, `score`, and optional `index` of the word in the original text. A negative score means the word contributed negatively to that class prediction. May also contain an optional `explainableText` key/value pair: for each class, a list with the `word`, `score`, and optional `index` of the word in the preprocessed text. |
POST /shutdown
Shuts the model down. The model server process should exit with exit code 0.
Response
The model server is not required to send a response and may simply drop the connection; however, a response is encouraged.
Status codes

| Status code | Description |
|---|---|
| 202 | Request accepted. The server process exits after returning the response. |
| 500 | Unexpected error shutting down the model. |
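Because the server may drop the connection instead of responding, a caller should treat connection errors on this route as a successful shutdown. The `request_shutdown` helper below is an illustrative sketch of that behavior.

```python
# Sketch: invoke /shutdown, tolerating a dropped connection as the spec allows.
import http.client


def request_shutdown(host, port, timeout=5):
    try:
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
        conn.request("POST", "/shutdown")
        status = conn.getresponse().status
        return status in (200, 202)
    except (http.client.HTTPException, OSError):
        # The server is allowed to exit without answering; treat a dropped
        # or refused connection as shutdown in progress.
        return True
```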