Score whole PowerBI DataSets dynamically in Azure ML

One of the most requested features when it comes to Azure ML is and has always been the integration into PowerBI. By now we are still lacking a native connector in PowerBI which would allow us to query a published Azure ML web service directly and score our datasets. Reason enough for me to dig into this issue and create some Power Query M scripts to do this. But lets first start off with the basics of Azure ML Web Services.

Every Azure ML project can be published as a Web Service with just a single click. Once its published, it can be used like any other Web Service. Usually we would send a record or a whole dataset to the Web Service, the Azure ML models does some scoring (or any other operation within Azure ML) and then sends the scored result back to the client. This is straight forward and Microsoft even supplies samples for the most common programming languages. The Web Service relies on a standardized REST API which can basically be called by any client. Yes, in our case this client will be PowerBI using Power Query.
Rui Quintino has already written an article on AzureML Web Service Scoring with Excel and Power Query and also Chris Webb wrote a more generic one on POST Request in Power Query in general Web Service and POST requests in Power Query. Even Microsoft recently published an article how you can use the R Integration of Power Query to call a Azure ML Web Service here.

Having tried these solutions, I have to admit that they have some major issues:
1) very static / hard coded
2) complex to write
3) operate on row-by-row basis and might run into the API Call Limits as discussed here.
4) need a local R installation

As Azure ML usually deal with tables, which are basically Power Query DataSets, a requirement would be to directly use a Power Query DataSet. The DataSet has to be converted dynamically into the required JSON structure to be POSTed to Azure ML. The returned result, usually a table again, should be converted back to a Power Query DataSet. And that’s what I did, I wrote a function that does all this for you. All information that you have to supply can be found in the configuration of your Azure ML Web Service:
– Request URI of your Web Service
– API Key
– the [Table to Score]

the [Table to Score] can be any Power Query table but of course has to have the very same structure (including column names and data types) as expected by the Web Service Input. Then you can simply call my function:
InvokeFunction_PowerBI_DataSet_in_AzureML
Score_PowerBI_DataSet_in_AzureML

The whole process involves a lot of JSON conversions and is kind of complex but as I encapsulated everything into M functions it should be quite easy to use by simply calling the CallAzureMLService-function.

However, here is a little description of the used functions:
ToAzureMLJson – converts any object that is passed in as an argument to a JSON element. If you pass in a table, it is converted to a JSON-array. Dates and Numbers are formatted correctly, etc. so the result can the be passed directly to Azure ML.

AzureMLJsonToTable – converts the returned JSON back to a Power Query Table. It obeys column names and also data types as defined in the Azure ML Web Service output. If the output changes (e.g. new columns are added) this will be taken care of dynamically!

CallAzureMLService – uses the two function from above to convert a table to JSON, POST the JSON to Azure ML and convert the result back to a Power Query Table.

Known Issues:
As the [Table to Score] will probably come from a SQL DB or somewhere else, you may run into issues with Privacy Levels/Settings and the Formula Firewall. In this case make sure to enable Fast Combine for your workbook as described here.

The maximum timeout of a Request/Response call to an Azure ML Web Service is 100 seconds. If your call exceeds this limit, you might get an error message returned.I ran a test and tried to score 60k rows (with 2 numeric columns) at once and it worked just fine, but I would assume that you can run into some Azure ML limits here very easily with bigger data sets. As far as I know, these 100 seconds are for the Azure ML itself only. If it takes several minutes to upload your dataset in the POST request, than this is not part of this 100 seconds. If you are still hitting this issue, you could further try to split your table into different batches, score them separately and combine the results again afterwards.

 

So these are the steps that you need to do in order to use your Azure ML Web Service together with PowerBI:
1) Create an Azure ML Experiment (or use an existing)
2) Publish the Experiment as a Web Service
3) note the URL and the API Key of your Web Service
4) run PowerBI and load the data that you want to score
5) make sure that the dataset created in 4) has the exact same structure as expected by Azure ML (column names, data types, …)
6) call the function “CallAzureMLWebService” with the parameters from 3) and 5)
7) wait for the Web Service to return the result set
8) load the final table into PowerBI (or do some further transformations before)

And that’s it!

Download:
You can find a PowerBI workbook which contains all the functions and code here: CallAzureMLWebService.pbix
I used a simple Web Service which takes 2 numeric columns (“Number1” and “Number2”) and returns the [Number1] * [Number2] and [Number1] / [Number2]

PS: you will not be able to run the sample as it is as I changed the API Key and also the URL of my original Azure ML Web Service

8 thoughts on “Score whole PowerBI DataSets dynamically in Azure ML

  1. Hi there
    Ive been looking for exactly this! So glad Ive found this.
    One question tho, what if I subscribe Cluster API via data.marketplace, how do I consume that API via PowerBI? In effect I wont have API Key and URL.
    Hope you can advice.

    Thanks
    Peddie

    • Hi Ped,
      I have not worked with the Azure Marketplace API yet but I wonder why you would need to send/push data to them?
      There is a pre-built Connector for Microsoft Azure Marketplace, why does this one not work?

      or are you talking about a published Azure ML webservice on the marketplace which you can use to score your own data?

      -gerhard

  2. Thanks for this Gerhard. I have an interesting problem with my implementation of this. First, I am using a custom query to call the Twitter API based on a search term that is a parameter. Once I have transformed the response into the Power Query table with the correct labels/data types, I am using your code above to call my Azure ML text sentiment analysis service. Everything works fine in the query editor. However, when I change the search term or try to close and apply, I get a 400 Bad Request error. At first I thought maybe there was some non-English text, but that doesn’t seem to be the issue. What is really bizarre is that my results load into query editor (I can see the returned table with my experiment output columns as they should be), but fail to load to the dashboard. Any thoughts? I am happy to share my PBIX with you.

    • In general there should be no difference whether you run it at design time or when you actually load the data into the datamodel.
      just send me/share the pbix file so i can take a closer look and investigate into the issue.

      Kind regards,
      -gerhard

      • I’ve found that if I import the Tweets from an actual file, rather than pull them into PBI directly via API, the Azure ML web call works fine. However, if I pull the tweets in via API and then send them to Cognitive Services, it works fine.

        Another interesting problem that I can’t figure out (but will try to add in the Wait function you reference) is that my Azure ML web call seems to be limited to 2199 tweets being sent. If I keep top rows at 2199, I get a successful web call with my sentiment scores, but if I change that Keep Top Rows to 2200+, I get a 400 error.

          • I figured it out after examining tweet #2200 and some others that seemed to be causing the error. I had tried to clean the tweets using the built in Clean function, but I discovered that some tweets had a “\” symbol, and when removed, the tweets are successfully analyzed by the ML model. Perhaps the back slash impacts the feature hashing in the model.

          • Hi David,
            thanks for the Feedback!
            what you are saying is that the issue relates to Azure ML and not PowerBI – right?
            Have you had success with the batching? I was thinking of this also but did not have any time yet to do some more investigation

            -gerhard

Leave a Reply