v1 Container Specifications REST

[Deprecated] Modzy v1 Container Spec

Modzy requests the model’s status, sends data to run, and retrieves output results from a model container. When needed, it can request that the container shut down. The container specification is an API that responds to these requests. Here we provide a sample specification that uses REST over HTTP/1.1 with JSON, for your reference.

📘

We currently support Docker containers. Support for all OCI-compliant containers is coming soon.

Requirements

Container

The Docker container must expose an HTTP API on the port specified by the PSC_MODEL_PORT environment variable that implements the /status, /run, and /shutdown routes detailed below.
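The three routes can be sketched with Python’s standard library alone. This is a minimal sketch, not a reference implementation: the model-loading and inference logic are placeholders and the handler names are our own; only the routes, the PSC_MODEL_PORT variable, and the response format come from this spec.

```python
import json
import os
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

def status_body():
    # Placeholder: a real container would load the model here (once).
    return {"message": "Model ready to run.", "status": "OK", "statusCode": 200}

class ModelAPI(BaseHTTPRequestHandler):
    def _reply(self, body):
        # All routes answer with the application/json response DTO.
        payload = json.dumps(body).encode()
        self.send_response(body["statusCode"])
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def do_GET(self):
        if self.path == "/status":
            self._reply(status_body())

    def do_POST(self):
        if self.path == "/run":
            length = int(self.headers.get("Content-Length", 0))
            job = json.loads(self.rfile.read(length) or b"{}")
            # Placeholder: read from job["input"], write to job["output"].
            self._reply({"message": "Successful inference.", "status": "OK",
                         "statusCode": 200})
        elif self.path == "/shutdown":
            self._reply({"message": "Shutting down.", "status": "OK",
                         "statusCode": 202})
            # Stop serving after the response is sent, then exit with code 0.
            threading.Thread(target=self.server.shutdown).start()

if __name__ == "__main__":
    port = int(os.environ.get("PSC_MODEL_PORT", "8080"))
    HTTPServer(("", port), ModelAPI).serve_forever()
```

Launching this script from the container entry point satisfies the requirement above: the server binds whatever port Modzy passes in PSC_MODEL_PORT.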

Entry Point

The container entry point is responsible for launching the HTTP server, e.g. when run as:

docker run image

Use the exec syntax in your Dockerfile to define the entry point:

COPY entrypoint.sh ./
ENTRYPOINT ["./entrypoint.sh"]

Inputs and outputs

Inputs hold the input-items sent to the model for processing. Outputs hold the result files the model returns. Specify the input and output item names to link the model to its input-items and results.

Each model defines a filename for its input and output items. The filenames of the input-items you send must match the model’s input-item names.

For example, the Multi-language OCR model defines its inputs and outputs as follows:

  • the input-item has two data-items: image and config.json,
  • the output is named results.json.

In this case, the path to the input directory contains:

ls /path/to/input/directory
  image config.json

And when the results are available, the output directory contains:

ls /path/to/output/directory
  results.json
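Before running, a container can check the mounted input directory against the names it expects. A small sketch, with EXPECTED_INPUTS taken from the OCR example above; the helper name is our own:

```python
import os

EXPECTED_INPUTS = {"image", "config.json"}  # from the OCR example above

def missing_input_items(input_dir):
    # Return a 422-style message naming missing input-items,
    # or None when the directory is complete.
    missing = EXPECTED_INPUTS - set(os.listdir(input_dir))
    if missing:
        return "Missing input items: " + ", ".join(sorted(missing))
    return None
```

Returning such a message from /run (with status 422, described below) gives the caller the detailed validation error the spec asks for.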

Check out our run a model tutorial for more details.

HTTP API specifications

The response DTO

All routes should respond with an application/json type and with this format:

{
  "message": "The call went well or terribly.",
  "status": "OK",
  "statusCode": 200
}

Ensure the message provides useful feedback about model errors.

[Get /status]

Initializes the model and returns its status.

Response

{
  "message": "The call went well.",
  "status": "Ok",
  "statusCode": 200
}
Status codes

Status 200: The model is ready to run.
Status 500: Unexpected error loading the model.

Batch processing

To enable batch processing, the /status route should return a batch size. A model’s batch_size is the maximum number of input items it can process simultaneously while mounted on a GPU.

{
  "batch_size": 64,
  "message": "Model Ready to receive 64 inputs in run route.",
  "status": "Ok",
  "statusCode": 200
}

[Post /run]

Runs the model inference on a given input.

Request

Add the job configuration object (use application/json):

{
  "type": "file",
  "explain": true,
  "input": "/path/to/input/directory",
  "output": "/path/to/output/directory"
}
Parameters

type: The type of the inputs and outputs. It can be file or batch.
explain (optional): Enables the explainability feature when the model offers it.
input: The filesystem directory path from which the model should read input-item files.
output: The filesystem directory path where the model should write output data files.
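From the caller’s side, /run is a plain JSON POST. A sketch using Python’s urllib; the base URL is hypothetical and the helper name is our own:

```python
import json
import urllib.request

def build_run_request(base_url, job):
    # Build the POST /run request with an application/json body,
    # matching the job configuration object described above.
    data = json.dumps(job).encode()
    return urllib.request.Request(
        base_url + "/run",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

job = {
    "type": "file",
    "explain": True,
    "input": "/path/to/input/directory",
    "output": "/path/to/output/directory",
}
req = build_run_request("http://localhost:8080", job)
# urllib.request.urlopen(req) would send it once the container is listening.
```

Posting anything other than application/json should earn the 415 response described below.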

Batch processing

If the model supports batch processing, the run call requires "type": "batch" and replaces input with inputs and output with outputs:

{
  "type": "batch",
  "inputs": ["/path/to/input/directory", "/path/to/other/input/directory"],
  "outputs": ["/path/to/output/directory", "/path/to/other/output/directory"]
}

Response

{
  "message": "Successful inference.",
  "status": "OK",
  "statusCode": 200
}
Status codes

Status 200: Successful inference.
Status 400: Invalid job configuration object. The job configuration object is malformed, or the expected files do not exist or cannot be read or written. When running on the platform this should never occur, but it may be useful for debugging.
Status 415: Invalid media type. The client did not post application/json in the HTTP body. When running on the platform this should never occur, but it may be useful for debugging.
Status 422: Unprocessable input file. The model cannot run inference on the provided input files (for example, an input file may be the wrong format, too large, or too small). The response message should contain a detailed validation error that explains why the model cannot process a given input file.
Status 500: Unexpected error running the model.

For batch runs that partially fail, the response can include an errors array that maps each failed input’s index to an error message, while still returning a 200:

{
  "errors": [
    {
      "1": "Error in second input"
    }
  ],
  "message": "Success with errors.",
  "status": "OK",
  "statusCode": 200
}
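The partial-failure reporting above can be implemented by pairing each input directory with its output directory by position and collecting per-index errors. A sketch; run_batch and the injected process_one callable are our own names:

```python
def run_batch(job, process_one):
    """Run per-item inference over paired input/output directories.

    process_one(input_dir, output_dir) is a hypothetical callable that
    raises on failure; errors are keyed by the input's position.
    """
    errors = []
    for i, (in_dir, out_dir) in enumerate(zip(job["inputs"], job["outputs"])):
        try:
            process_one(in_dir, out_dir)
        except Exception as exc:
            errors.append({str(i): str(exc)})
    body = {
        "message": "Success with errors." if errors else "Successful inference.",
        "status": "OK",
        "statusCode": 200,
    }
    if errors:
        body["errors"] = errors
    return body
```

Note that a partial failure still returns 200; only errors that prevent the whole run from executing should map to the 4xx/5xx codes above.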

Explainability

Output files contain the inference results. Models with built-in explainability return output files with one of the following structures, depending on the model type:

{
  "modelType": "",
  "result": {
    "classPredictions": []
  },
  "explanation": {
    "maskRLE": []
  }
}
{
  "modelType": "textClassification",
  "result": {
    "classPredictions": []
  },
  "explanation": {
    "wordImportances": {},
    "explainableText": {}
  }
}

Image classification models explainability object

modelType (string): Defines the explanation format. Possible options: imageClassification, imageSegmentation, objectDetection.
result (object): Contains the results in a classPredictions array.
explanation (object): Contains a maskRLE array with the explanation and a dimensions object with the height and width in pixels. The maskRLE follows column-major (Fortran) order.
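The table pins down the ordering (column-major) but not the run-length convention itself. The sketch below assumes COCO-style counts, i.e. alternating run lengths of 0s and 1s starting with zeros; the function name and that assumption are ours:

```python
def decode_mask_rle(counts, height, width):
    """Decode alternating 0/1 run lengths into a 2-D mask.

    Assumes COCO-style counts (runs of 0s and 1s, starting with zeros)
    laid out in column-major (Fortran) order, per the table above.
    """
    flat = []
    value = 0
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value
    assert len(flat) == height * width, "counts must cover the full mask"
    # Column-major: consecutive elements fill each column top to bottom.
    return [[flat[col * height + row] for col in range(width)]
            for row in range(height)]

# A 2x3 mask whose first column is all ones:
mask = decode_mask_rle([0, 2, 4], height=2, width=3)
```

The dimensions object from the explanation supplies the height and width arguments.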

Text classification models explainability object

modelType (string): Defines the explanation format. Possible options: textClassification.
result (object): Contains the results in a classPredictions list, with a prediction and score for each class.
explanation (object): Contains:

  • a wordImportances key/value pair: for each class, a list that includes the word, its score, and the optional index of the word in the original text. A negative score means the word contributed negatively to that class prediction.
  • an optional explainableText key/value pair: for each class, a list that includes the word, its score, and the optional index of the word in the preprocessed text.

[Post /shutdown]

Shuts down the model. The model server process should exit with exit code 0.

Response

The model server is not required to send a response and may simply drop the connection; however, a response is encouraged.

Status codes

Status 202: Request accepted. The server process exits after returning the response.
Status 500: Unexpected error shutting down the model.