Resources

Overview

Processing engines

Every account has a running capacity, given by the number of processing engines available to run jobs. Each engine processes one input item at a time, so the number of available engines determines the maximum number of input items a job can process in parallel. Inputs that are not being processed wait in the input queue until an engine picks them up.

Set a model version’s processing capacity to control how many of the account’s processing engines the model version can use. If all the processing engines are busy running models, new job requests wait in the queue until engines become available again.
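The queueing behavior described above can be illustrated with a small simulation. This is a minimal sketch, not Modzy code: the function name `simulate_job` is hypothetical, and it makes the simplifying assumption that every input takes the same time to process, so work proceeds in rounds.

```python
from collections import deque

def simulate_job(input_items, engine_count):
    """Return a list of rounds; each round lists the items processed in parallel."""
    if engine_count < 1:
        raise ValueError("engine_count must be a positive integer")
    queue = deque(input_items)  # inputs waiting for an engine
    rounds = []
    while queue:
        # Each engine picks up one item; the remaining items stay queued.
        batch = [queue.popleft() for _ in range(min(engine_count, len(queue)))]
        rounds.append(batch)
    return rounds

# Five inputs with two engines: at most two inputs run in parallel.
print(simulate_job(["a", "b", "c", "d", "e"], engine_count=2))
# [['a', 'b'], ['c', 'd'], ['e']]
```

With more engines than inputs, every input is picked up in the first round and the queue stays empty.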

Nodes

Nodes are virtual or physical machines that provide the resources needed to run processing engines. These resources include CPU, GPU, memory, and a maximum number of processing engines that can be scheduled onto the node. Check out the Kubernetes documentation for more details.

The processing object

{
    "minimumParallelCapacity": 1,
    "maximumParallelCapacity": 3
}

minimumParallelCapacity

number

The minimum number of processing engines a model version can run. It must be a positive integer.

maximumParallelCapacity

number

The maximum number of processing engines a model version can run. It must be a positive integer.
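Before sending a processing object, it can help to check it locally. The sketch below is illustrative only: `validate_processing` is a hypothetical helper, and the rule that the minimum must not exceed the maximum is an assumption that the field descriptions above do not state.

```python
def validate_processing(processing: dict) -> None:
    """Raise ValueError if the processing object looks invalid."""
    for key in ("minimumParallelCapacity", "maximumParallelCapacity"):
        value = processing.get(key)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            raise ValueError(f"{key} must be a positive integer, got {value!r}")
    # Assumption: the minimum should not exceed the maximum.
    if processing["minimumParallelCapacity"] > processing["maximumParallelCapacity"]:
        raise ValueError("minimumParallelCapacity exceeds maximumParallelCapacity")

# The sample object shown above passes the check.
validate_processing({"minimumParallelCapacity": 1, "maximumParallelCapacity": 3})
```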

The engines object

{
  "name" : "...",
  "createdAt" : "...",
  "ready" : true
}

name

string

The engine’s name.

createdAt

string

The engine’s creation date in ISO 8601 (YYYY-MM-DDThh:mm:ss.sTZD) format.

ready

boolean

The engine’s status. When true, the engine is ready to process inputs.
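As an illustration of working with engine objects, the sketch below keeps only the ready engines and sorts them by creation date. The helper name `ready_engines` and the sample values are hypothetical. `datetime.fromisoformat` is strict before Python 3.11 (for example, it rejects a trailing `Z` and needs exactly three or six fractional digits), so real `createdAt` values may need normalizing first.

```python
from datetime import datetime

def ready_engines(engines):
    """Return (name, created_at) pairs for ready engines, oldest first."""
    ready = [
        (engine["name"], datetime.fromisoformat(engine["createdAt"]))
        for engine in engines
        if engine["ready"]
    ]
    return sorted(ready, key=lambda pair: pair[1])

# Hypothetical sample data in the shape of the engines object.
sample = [
    {"name": "engine-b", "createdAt": "2021-03-04T12:30:00.000+00:00", "ready": True},
    {"name": "engine-a", "createdAt": "2021-03-04T09:00:00.000+00:00", "ready": True},
    {"name": "engine-c", "createdAt": "2021-03-05T08:15:00.000+00:00", "ready": False},
]
print([name for name, _ in ready_engines(sample)])  # ['engine-a', 'engine-b']
```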

The model deployment state object

{
  "hasError": false,
  "ready": true,
  "beingMonitored": true
}

hasError

boolean

When true, an error is preventing the engine from starting. Modzy keeps trying to spin it up.

ready

boolean

When true, the engine is ready to process inputs.

beingMonitored

boolean

When true, the API is continuously checking the engine’s status.
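The three flags can be folded into a single label for logs or dashboards. This is a sketch under stated assumptions: `describe_state`, the labels it returns, and the precedence order (error first, then ready) are all hypothetical choices, not part of the API.

```python
def describe_state(state: dict) -> str:
    """Collapse the deployment state flags into one short label."""
    if state["hasError"]:
        label = "error"      # an error is keeping the engine from starting
    elif state["ready"]:
        label = "ready"      # the engine can process inputs
    else:
        label = "starting"   # no error reported, but not ready yet
    # Assumption: flag states that the API is not actively checking.
    if not state["beingMonitored"]:
        label += " (unmonitored)"
    return label

print(describe_state({"hasError": False, "ready": True, "beingMonitored": True}))
# ready
```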

The nodes object

[
  {
    "name": "...",
    "creationTimestamp": "...",
    "annotation": [
      {
        "key": "...",
        "value": "..."
      }
    ],
    "labels": [
      {
        "key": "...",
        "value": "..."
      }
    ],
    "status": {
      "allocatable": {
        "memory": "...",
        "cpu": "...",
        "gpu": "...",
        "pods": 10
      },
      "capacity": {
        "memory": "...",
        "cpu": "...",
        "gpu": "...",
        "pods": 10
      },
      "conditions": [],
      "images": []
    }
  }
]

name

string

A node’s name. Nodes may be virtual or physical machines.

creationTimestamp

string

The time when the node was added.

annotation

array

An array of key-value pairs that hold node metadata.

labels

array

An array of key-value pairs that tag the node.

status

object

An object that contains the node’s status.
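Because `labels` and `annotation` arrive as arrays of key-value pairs, it is often convenient to turn them into dictionaries before filtering nodes. The sketch below is illustrative: the helper names and the `accelerator` label are hypothetical, and only the fields it needs are filled in.

```python
def pairs_to_dict(pairs):
    """Convert [{'key': k, 'value': v}, ...] into {k: v, ...}."""
    return {pair["key"]: pair["value"] for pair in pairs}

def nodes_with_label(nodes, key, value):
    """Return the names of nodes whose labels contain key == value."""
    return [
        node["name"]
        for node in nodes
        if pairs_to_dict(node.get("labels", [])).get(key) == value
    ]

# Hypothetical sample data in the shape of the nodes object.
nodes = [
    {"name": "node-1", "labels": [{"key": "accelerator", "value": "gpu"}]},
    {"name": "node-2", "labels": [{"key": "accelerator", "value": "none"}]},
    {"name": "node-3", "labels": []},
]
print(nodes_with_label(nodes, "accelerator", "gpu"))  # ['node-1']
```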

The node status object

{
  "allocatable": {
    "memory": "...",
    "cpu": "...",
    "gpu": "...",
    "pods": 10
  },
  "capacity": {
    "memory": "...",
    "cpu": "...",
    "gpu": "...",
    "pods": 10
  },
  "conditions": [],
  "images": []
}

allocatable

object

The amount of a node’s resources that can be allocated to processing engines.

capacity

object

A node’s total amount of resources.

conditions

array

Describes the status of all running nodes. Conditions include Ready, DiskPressure, MemoryPressure, PIDPressure, and NetworkUnavailable.

images

array

The names and sizes of the container images required to run an application.
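The `pods` value under `allocatable` can be summed across nodes to estimate how many processing engines could be scheduled in total. This is a rough sketch: `total_allocatable_pods` is a hypothetical helper, and it assumes that each engine occupies one pod slot, which the reference above does not confirm.

```python
def total_allocatable_pods(nodes):
    """Sum status.allocatable.pods over a list of node objects."""
    return sum(node["status"]["allocatable"]["pods"] for node in nodes)

# Hypothetical sample: only the fields used here are filled in.
nodes = [
    {"name": "node-1", "status": {"allocatable": {"pods": 10}, "capacity": {"pods": 10}}},
    {"name": "node-2", "status": {"allocatable": {"pods": 8}, "capacity": {"pods": 10}}},
]
print(total_allocatable_pods(nodes))  # 18
```

The `memory`, `cpu`, and `gpu` values are strings, so they would need parsing before they could be summed in the same way.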