Microsoft's PowerBI is a business intelligence data visualization tool

Microsoft's PowerBI is a data visualization tool with a primary focus on business intelligence (BI) data. It can combine data from a database, webpage, or structured files such as spreadsheets, CSV, XML, and JSON.

Using Modzy, PowerBI can be extended to process unstructured imagery, video, and audio data, and to apply advanced ML/AI analyses. In this tutorial, you will learn two ways to use AI models with PowerBI, whether to extract data from unstructured sources or to process existing data with ML algorithms: modifying a Template App and running a Python Power Query.

Prepare PowerBI

PowerBI requires Python to use AI/ML or Modzy in your visualizations. To install Python, follow these instructions: <https://docs.microsoft.com/en-us/power-bi/connect-data/desktop-python-scripts>

After installing Python, install Pandas and Modzy's SDK by running these commands in your Windows Command Prompt or shell:

py -m pip install pandas
py -m pip install modzy-sdk
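
To confirm that both packages are visible to the Python interpreter, a quick check along these lines may help. It is only a sketch: the helper name missing_packages is made up for illustration, and it assumes the SDK is imported as modzy, as in the script later in this tutorial.

```python
import importlib.util


def missing_packages(names):
    """Return the package names that the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "modzy" is the import name assumed for the modzy-sdk package.
    missing = missing_packages(["pandas", "modzy"])
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

Run it with the same interpreter that PowerBI is configured to use, since a package installed for a different Python installation will still be reported as missing.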

Template App

If you are using natural language processing (NLP) or quantifying or extracting meaning from tabular data, it may be easiest to start with our Template App, available for download here.

This template will open with two data sources:

  • CSV of Amazon Review data, loaded from GitHub. You may replace this data file with your own data to be processed.
  • JSON schema that specifies which model will be used and the credentials needed to use it. You must edit this file to include the appropriate credentials for your user or project:
      "base_url": "https://app.modzy.com/api",
      "api_key": "<your.API_KEY>",
      "id_column_name": "review_id",
      "data_column_name": "review_body",
      "model_id": "ed542963de",
      "model_version": "1.0.1",
      "model_input_name": "input.txt",
      "chunk_size": 250,

Below the API key, you may edit the schema to reflect your own data. id_column_name is the column that holds a unique ID or index for each row of your data. data_column_name is the column whose contents will be processed by the model. Modify model_id, model_version, and model_input_name to match the specifics of the model used to process the data.
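
To make the role of each setting concrete, here is a minimal sketch of how a script could read these settings and turn table rows into model inputs. It is not the Template App's actual code: the helper names load_config and build_input_chunks are hypothetical, the surrounding braces are added so the fragment parses as JSON, and treating chunk_size as the maximum number of rows sent per job is an assumption.

```python
import json

REQUIRED_KEYS = (
    "base_url", "api_key", "id_column_name", "data_column_name",
    "model_id", "model_version", "model_input_name", "chunk_size",
)


def load_config(text):
    """Parse the JSON settings and fail early if a key is missing."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError("Missing settings: " + ", ".join(missing))
    return config


def build_input_chunks(rows, config):
    """Group rows into dicts of at most chunk_size inputs, keyed by row ID.

    Assumption: chunk_size is the number of rows submitted per job.
    """
    id_col = config["id_column_name"]
    data_col = config["data_column_name"]
    input_name = config["model_input_name"]
    size = config["chunk_size"]
    chunks = []
    for start in range(0, len(rows), size):
        chunk = {
            str(row[id_col]): {input_name: row[data_col]}
            for row in rows[start:start + size]
        }
        chunks.append(chunk)
    return chunks


example_config = load_config("""
{
  "base_url": "https://app.modzy.com/api",
  "api_key": "<your.API_KEY>",
  "id_column_name": "review_id",
  "data_column_name": "review_body",
  "model_id": "ed542963de",
  "model_version": "1.0.1",
  "model_input_name": "input.txt",
  "chunk_size": 250
}
""")

# Two made-up rows standing in for the review table.
example_rows = [
    {"review_id": "R1", "review_body": "Great product"},
    {"review_id": "R2", "review_body": "Stopped working after a week"},
]
example_chunks = build_input_chunks(example_rows, example_config)
```

With a Pandas DataFrame, dataset.to_dict("records") produces a list of row dictionaries in the shape this sketch expects.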

To customize the output format of your data, enter PowerBI's Power Query Editor, select Table and click the gear icon in the Run Python Modzy step. The final block of code starting with for job in alljobs: can be edited to match the output of your particular model.

Upon execution, the model will process the data and generate a new table called Table. This will include all model output for use in dashboard visualizations.

Python Query

The most flexible way to use AI/ML analytics on data within PowerBI is to execute a Python query.

  • Open the dashboard containing the relevant data.
  • Right-click the target table and select 'Edit Query' to open the Advanced Query Editor.
  • Select the Transform tab and click the Run Python Script button.
  • The Python Script Editor will open. Note that your data is contained in a Pandas DataFrame called 'dataset'
  • Using the python-sdk code example from your model's biography page, fill in the script with something like the example below, replacing uid_column and data_column with the names of the columns appropriate to your data:
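from modzy import ApiClient

# Replace the placeholder strings with your own API key and model details.
client = ApiClient(base_url='https://app.modzy.com/api', api_key='<YOUR.API_KEY>')

# Build one input item per row, keyed by that row's unique ID.
inputs = {}
for index, row in dataset.iterrows():
    inputs[row['uid_column']] = {
        'input.txt': row['data_column']
    }

# Submit all rows as a single job and wait for it to finish.
job = client.jobs.submit_text('<MODEL_ID>', '<MODEL_VERSION>', inputs)
results = client.results.block_until_complete(job, timeout=None)

for r in results['results']:
    pass  # insert output processing code here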
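
The output processing step depends on what your model returns. As a rough sketch, the function below flattens a results dictionary into one row per input. The shape of mock_results (a "results" mapping from each input ID to an output file named results.json) and the label and score fields are assumptions made for illustration, so check them against your own model's output before relying on this.

```python
def flatten_results(results, output_name="results.json"):
    """Turn a results dictionary into a list of flat row dictionaries.

    Assumption: results["results"] maps each input ID to a dictionary
    of output files, and output_name is the file holding the output.
    """
    rows = []
    for uid, outputs in results["results"].items():
        output = outputs.get(output_name)
        if isinstance(output, dict):
            # Spread the model's fields into columns next to the ID.
            rows.append({"uid": uid, **output})
        else:
            rows.append({"uid": uid, "output": output})
    return rows


# Hypothetical output for two inputs; real field names will differ.
mock_results = {
    "results": {
        "R1": {"results.json": {"label": "positive", "score": 0.97}},
        "R2": {"results.json": {"label": "negative", "score": 0.88}},
    }
}
flat_rows = flatten_results(mock_results)
```

Inside the PowerBI script, passing rows like these to pandas.DataFrame gives you a table you can select as the result of the Python step.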