Container specifications: REST

Modzy requests the model’s status, sends data to run, and retrieves output results from a model container. When needed, it can request that a container shut down. The container specification is the API that responds to these requests. Here we provide a sample specification that uses REST over HTTP/1.1 with JSON to send data, for your reference.

📘 We currently support Docker containers. Support for all OCI-compliant containers is coming soon.

Requirements

Container

The Docker container must expose an HTTP API on the port specified by the PSC_MODEL_PORT environment variable that implements the /status, /run, and /shutdown routes detailed below.
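
As a sketch of these requirements, a minimal stdlib-only Python server might look like the following. The dto helper, route messages, and placeholder comments are illustrative assumptions, not part of the specification; real model code replaces the placeholders.

```python
# Minimal sketch of a model server exposing /status, /run, and /shutdown
# on the port given by PSC_MODEL_PORT. Illustrative only: dto() and the
# route messages are assumptions; inference logic is a placeholder.
import json
import os
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def dto(message, status="OK", status_code=200, **extra):
    """Build the standard response DTO that every route returns."""
    body = {"message": message, "status": status, "statusCode": status_code}
    body.update(extra)
    return body


class ModelHandler(BaseHTTPRequestHandler):
    def _send(self, body):
        payload = json.dumps(body).encode()
        self.send_response(body["statusCode"])
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def do_GET(self):
        if self.path == "/status":
            self._send(dto("Model ready."))  # model initialized successfully
        else:
            self._send(dto("No such route.", "NOT_FOUND", 404))

    def do_POST(self):
        if self.path == "/run":
            length = int(self.headers.get("Content-Length", 0))
            job = json.loads(self.rfile.read(length))
            # ... read job["input"], run inference, write job["output"] ...
            self._send(dto("Inference complete."))
        elif self.path == "/shutdown":
            self._send(dto("Shutting down.", "OK", 202))
            # Stop serving from another thread, then exit with code 0.
            threading.Thread(target=self.server.shutdown).start()
        else:
            self._send(dto("No such route.", "NOT_FOUND", 404))


def serve():
    port = int(os.environ["PSC_MODEL_PORT"])  # port assigned by the platform
    HTTPServer(("", port), ModelHandler).serve_forever()
```

Binding to the PSC_MODEL_PORT value (rather than a hardcoded port) is what lets the platform wire requests to the container.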

Entry Point

The container entry point is responsible for launching the HTTP server, e.g. when run as:

docker run image

Use the exec syntax in your Dockerfile to define the entry point:

COPY entrypoint.sh ./
ENTRYPOINT ["./entrypoint.sh"]

Inputs and outputs

Inputs hold the input-items sent to the model to be processed. Outputs hold the result files returned by the model. Specify the input and output item names to link the model to input-items and results.

Each model defines a filename for the input and output items. The filenames of the input-items sent must match the model’s input-item names.

For example, the Multi-language OCR model defines its inputs and outputs as follows:

  • the input-item has two data-items: image and config.json,
  • the output is named results.json.

In this case, the path to the input directory contains:

ls /path/to/input/directory
  image config.json

And when the results are available, the output directory contains:

ls /path/to/output/directory
  results.json

Check out our run a model tutorial for more details.

HTTP API specifications

The response DTO

All routes should respond with the application/json content type, in this format:

{
  "message": "The call went well or terribly.",
  "status": "OK",
  "statusCode": 200
}

Ensure the message provides useful feedback about model errors.

[Get /status]

Initializes the model and returns its status.

Response

{
  "message": "The call went well.",
  "status": "Ok",
  "statusCode": 200
}

Status codes

Status 200

The model is ready to run.

Status 500

Unexpected error loading the model.

Batch processing

To enable batch processing, the /status route should return a batch processing size. A model’s batch_size is the maximum number of input items it can process simultaneously while mounted on a GPU.

{
  "batch_size": 64,
  "message": "Model Ready to receive 64 inputs in run route.",
  "status": "Ok",
  "statusCode": 200
}
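
A small sketch of building the /status body, assuming a batch-capable model simply adds batch_size to the standard DTO (the helper name and messages are illustrative):

```python
def status_body(batch_size=None):
    """Build the /status response; include batch_size only for batch models."""
    body = {"message": "Model ready.", "status": "Ok", "statusCode": 200}
    if batch_size is not None:
        body["batch_size"] = batch_size
        body["message"] = f"Model ready to receive {batch_size} inputs in run route."
    return body
```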

[Post /run]

Runs the model inference on a given input.

Request

Send the job configuration object as application/json in the request body:

{
  "type": "file",
  "explain": true,
  "input": "/path/to/input/directory",
  "output": "/path/to/output/directory"
}

Parameters

type

The type of the input and output. It can be file or batch.

explain optional

Enables the explainability feature when the model offers the option.

input

The filesystem directory path from which the model should read input-item files.

output

The filesystem directory path where the model should write output data files.
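
Putting the parameters together, the platform's call to /run can be sketched with the stdlib as follows. The helper names build_job and post_run, the localhost host, and the port argument are assumptions for illustration.

```python
# Sketch of assembling and posting a /run job configuration object.
# build_job/post_run are hypothetical names; localhost is an assumption.
import json
import urllib.request


def build_job(input_dir, output_dir, explain=False):
    """Assemble the job configuration object for a single (file) run."""
    return {
        "type": "file",
        "explain": explain,
        "input": input_dir,
        "output": output_dir,
    }


def post_run(port, job):
    """POST the job to the container's /run route and decode the DTO."""
    req = urllib.request.Request(
        "http://localhost:%d/run" % port,
        data=json.dumps(job).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```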

Batch processing

If the model supports batch processing, the run call requires "type": "batch" and replaces input with inputs and output with outputs:

{
  "type": "batch",
  "inputs": ["/path/to/input/directory", "/path/to/other/input/directory"],
  "outputs": ["/path/to/output/directory", "/path/to/other/output/directory"]
}

Response

{
  "message": "Successful inference.",
  "status": "OK",
  "statusCode": 200
}

Status codes

Status 200

Successful inference.

Status 400

Invalid job configuration object.
The job configuration object is malformed or the expected files do not exist or cannot be read or written.
When running on the platform this should never occur but may be useful for debugging.

Status 415

Invalid media type.
The client did not post application/json in the HTTP body.
When running on the platform this should never occur but may be useful for debugging.

Status 422

Unprocessable input file.
The model cannot run inference on the provided input files (for example an input file may be the wrong format, too large, too small, etc).
The response message should contain a detailed validation error that explains why the model cannot process a given input file.

Status 500

Unexpected error running the model.

Partial failures

For batch runs, the response can report per-input failures while keeping a 200 status code. Each entry in the errors list maps the index of a failed input to an error message:

{
  "errors": [
    {
      "1": "Error in second input"
    }
  ],
  "message": "Success with errors.",
  "status": "OK",
  "statusCode": 200
}
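
A batch handler might collect per-input failures into an errors list like the one shown above. This sketch assumes each entry maps the failed input's zero-based index, as a string, to an error message; run_batch and run_one are hypothetical names.

```python
# Hedged sketch of a batch /run handler that reports partial failures.
# Assumption: "errors" maps each failed input's zero-based index (as a
# string) to an error message, as in the example response above.
def run_batch(inputs, outputs, run_one):
    """Call run_one(input_dir, output_dir) per pair; collect failures."""
    errors = []
    for index, (inp, out) in enumerate(zip(inputs, outputs)):
        try:
            run_one(inp, out)
        except Exception as exc:
            errors.append({str(index): str(exc)})
    body = {
        "message": "Success with errors." if errors else "Batch complete.",
        "status": "OK",
        "statusCode": 200,
    }
    if errors:
        body["errors"] = errors
    return body
```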

Explainability

Output files contain inference results. Models with built-in explainability return output files with one of these structures.

Image models:

{
  "modelType": "",
  "result": {
    "classPredictions": []
  },
  "explanation": {
    "maskRLE": []
  }
}

Text classification models:

{
  "modelType": "textClassification",
  "result": {
    "classPredictions": []
  },
  "explanation": {
    "wordImportances": {},
    "explainableText": {}
  }
}

Image classification models explainability object

modelType (string)

Defines the explanation format. Possible options: imageClassification, imageSegmentation, objectDetection.

result (object)

Contains the results in a classPredictions array.

explanation (object)

Contains a maskRLE array with the explanation and a dimensions object with the height and width in pixels. The maskRLE follows a column-major order (Fortran order).
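
The spec above does not define the run-length encoding in detail. As an illustration only, this sketch assumes maskRLE alternates run lengths of 0s then 1s, starting with 0s, with the flattened mask laid out in column-major (Fortran) order; decode_mask is a hypothetical helper.

```python
# Hedged sketch: decode a maskRLE into a height x width grid of 0s/1s.
# Assumptions (not confirmed by the spec): counts alternate runs of 0s
# then 1s, starting with 0s; the flat mask is in column-major order.
def decode_mask(rle, height, width):
    flat = []
    value = 0
    for count in rle:
        flat.extend([value] * count)
        value = 1 - value
    # Column-major: consecutive flat elements fill each column top to bottom.
    return [
        [flat[col * height + row] for col in range(width)]
        for row in range(height)
    ]
```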

Text classification models explainability object

modelType (string)

Defines the explanation format. Possible options: textClassification.

result (object)

Contains the results in a classPredictions list that consists of a prediction and score for each class.

explanation (object)

Contains:

  • a wordImportances key/value pair that consists of a list with the word, score, and optional index of the word in the original text for each class. If a score is negative, the word contributed negatively to that class prediction.
  • an optional explainableText key/value pair that consists of a list with the word, score, and optional index of the word in the preprocessed text for each class.

[Post /shutdown]

Shuts down the model server. The server process should exit with exit code 0.

Response

The model server is not required to send a response and may simply drop the connection; however, a response is encouraged.

Status codes

Status 202

Request accepted.
The server process exits after returning the response.

Status 500

Unexpected error shutting down the model.

