Model Inference in Batches

In this guide (also available as a Jupyter Notebook or in a Git Repo), we will use the Modzy Python SDK to submit a batch of inference inputs to a model. We will leverage Kaggle's Amazon Reviews dataset to generate our batch of data (a pre-downloaded and cleaned test set can be found here).

For more detailed usage documentation for our Python SDK, visit our GitHub page.

:computer: Environment Set Up

Create a virtual environment (venv, conda, or other preferred virtual environment) with Python 3.6 or newer.

Pip install the following package in your environment:

  • modzy-sdk>=0.11.3

And install Jupyter Notebooks in your preferred environment using the appropriate install instructions.
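As a sketch, the setup steps above might look like this on the command line (assuming `python` points at Python 3.6 or newer and using the built-in venv module; the environment name `modzy-env` is arbitrary):

```shell
# create and activate a virtual environment
python -m venv modzy-env
. modzy-env/bin/activate

# install the Modzy SDK and Jupyter Notebook
pip install "modzy-sdk>=0.11.3" notebook
```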

:arrow-down: Import Modzy SDK and Initialize Client

Insert your instance URL and personal API key to establish a connection to the Modzy API client.

# Import Libraries
from modzy import ApiClient, error
from pprint import pprint

Initialize Modzy API Client

# the URL we will use for authentication -- insert your Modzy instance URL below
API_URL = "https://<your.modzy.url>/api"
# the API key we will use for authentication -- paste in your personal API access key below
API_KEY = "<your.api.key>"

if API_URL == "https://<your.modzy.url>/api":
    raise Exception("Change the API_URL variable to your instance URL")
if API_KEY == "<your.api.key>":
    raise Exception("Insert your API Key")
    
# setup our API Client
client = ApiClient(base_url=API_URL, api_key=API_KEY)

:mag: Discover Available Models

In this notebook, we will submit a batch of data for inference to the Sentiment Analysis model.

# Query model by name
auto_model_info = client.models.get_by_name("Sentiment Analysis")
pprint(auto_model_info)
{'author': 'Open Source',
 'description': 'This model gives sentiment scores showing the polarity and '
                'strength of the emotions in text.',
 'features': [{'description': 'This model has a built-in explainability '
                              'feature. Click '
                              '[here](https://arxiv.org/abs/1602.04938) to '
                              'read more about model explainability.',
               'identifier': 'built-in-explainability',
               'name': 'Explainable'}],
 'images': [{'caption': 'Sentiment Analysis',
             'relationType': 'background',
             'url': '/modzy-images/ed542963de/image_background.png'},
            {'caption': 'Sentiment Analysis',
             'relationType': 'card',
             'url': '/modzy-images/ed542963de/image_card.png'},
            {'caption': 'Sentiment Analysis',
             'relationType': 'thumbnail',
             'url': '/modzy-images/ed542963de/image_thumbnail.png'},
            {'caption': 'Open Source',
             'relationType': 'logo',
             'url': '/modzy-images/companies/open-source/company-image.jpg'}],
 'isActive': True,
 'isCommercial': False,
 'isRecommended': True,
 'lastActiveDateTime': '2022-05-24T02:11:45.229+00:00',
 'latestActiveVersion': '1.0.1',
 'latestVersion': '1.0.27',
 'modelId': 'ed542963de',
 'name': 'Sentiment Analysis',
 'permalink': 'ed542963de-open-source-sentiment-analysis',
 'snapshotImages': [],
 'tags': [{'dataType': 'Input Type',
           'identifier': 'text',
           'isCategorical': True,
           'name': 'Text'},
          {'dataType': 'Task',
           'identifier': 'label_or_classify',
           'isCategorical': True,
           'name': 'Label or Classify'},
          {'dataType': 'Tags',
           'identifier': 'sentiment_analysis',
           'isCategorical': False,
           'name': 'Sentiment Analysis'},
          {'dataType': 'Tags',
           'identifier': 'text_analytics',
           'isCategorical': False,
           'name': 'Text Analytics'},
          {'dataType': 'Subject',
           'identifier': 'language_and_text',
           'isCategorical': True,
           'name': 'Language and Text'}],
 'versions': ['0.0.28', '1.0.1', '1.0.27', '0.0.27'],
 'visibility': ApiObject({
  "scope": "ALL"
})}
# Define Variables for Inference
MODEL_ID = auto_model_info["modelId"]
MODEL_VERSION = auto_model_info["latestActiveVersion"]

# pull the expected input filename from the model's sample input
sample_input = client.models.get_version_input_sample(MODEL_ID, MODEL_VERSION)
INPUT_FILENAME = list(sample_input["input"]["sources"]["0001"].keys())[0]

:open-file-folder: Create Batch of Data

Modzy allows you to submit either a single piece of data or a batch to your model. In this scenario, we will create a dictionary of data for our batch, where each of the 500 reviews we pull from the test set of 1000 will have its own entry.

with open("test.ft.txt", "r", encoding="utf-8") as f:
    text = f.readlines()[:500] # extract first 500

# strip the fastText "__label__<n> " prefix and trailing newline from each review
text_cleaned = [t.split("__label__")[-1][2:].replace("\n", "") for t in text]

# map each review to its own source entry, keyed by a unique name
sources = {"review_{}".format(i): {INPUT_FILENAME: review} for i, review in enumerate(text_cleaned)}
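To make the cleaning step concrete, here is a minimal, self-contained sketch run on a single hypothetical line in the fastText format used by the Kaggle test set (the sample text and the `"input.txt"` filename are illustrative assumptions):

```python
# a hypothetical line in the fastText format: "__label__<n> <review text>\n"
sample = "__label__2 Great sound quality: This album exceeded my expectations.\n"

# strip the "__label__<n> " prefix and the trailing newline, as above
cleaned = sample.split("__label__")[-1][2:].replace("\n", "")

# each review becomes its own batch entry keyed by a unique source name,
# with the model's expected input filename mapping to the review text
sources = {"review_0": {"input.txt": cleaned}}
print(sources["review_0"]["input.txt"])
# → Great sound quality: This album exceeded my expectations.
```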

:runner: Submit Inference to Model

Helper Function

Below is a helper function we will use to submit inference jobs to the Modzy platform and return the model output using the submit_text method. For additional job submission methods, visit the Python SDK docs page.

def get_model_output(model_identifier, model_version, data_sources, explain=False):
    """
    Args:
        model_identifier: model identifier (string)
        model_version: model version (string)
        data_sources: dictionary mapping each source name to a {filename: data} entry
        explain: boolean, defaults to False. If True, the model will return an explainable result
    """
    job = client.jobs.submit_text(model_identifier, model_version, data_sources, explain)
    result = client.results.block_until_complete(job, timeout=None)
    return result

Finally, we will leverage our get_model_output helper function to submit the batch of 500 reviews for inference and print the results for the first review in our notebook.

model_results = get_model_output(MODEL_ID, MODEL_VERSION, sources, explain=False)
first_review_results = model_results["results"]["review_0"]["results.json"]["data"]["result"]
pprint(first_review_results)
{'classPredictions': [ApiObject({
  "class": "neutral",
  "score": 0.716
}),
                      ApiObject({
  "class": "positive",
  "score": 0.214
}),
                      ApiObject({
  "class": "negative",
  "score": 0.07
})]}
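Because the batch response keys each result by the source name used at submission, it is straightforward to summarize sentiment across the whole batch. Below is a minimal sketch that tallies the top-scoring class per review, using plain dicts to mimic the response shape shown above (the real SDK wraps these entries in ApiObject instances, and the three hypothetical reviews here stand in for the 500 submitted):

```python
from collections import Counter

# hypothetical results payload mimicking the batch response shape above
model_results = {
    "results": {
        "review_0": {"results.json": {"data": {"result": {"classPredictions": [
            {"class": "neutral", "score": 0.716}]}}}},
        "review_1": {"results.json": {"data": {"result": {"classPredictions": [
            {"class": "positive", "score": 0.88}]}}}},
        "review_2": {"results.json": {"data": {"result": {"classPredictions": [
            {"class": "positive", "score": 0.91}]}}}},
    }
}

# tally the top-scoring class for every review in the batch
top_classes = Counter(
    r["results.json"]["data"]["result"]["classPredictions"][0]["class"]
    for r in model_results["results"].values()
)
print(top_classes.most_common())
# → [('positive', 2), ('neutral', 1)]
```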