Monitoring Azure Data Factory using PowerBI

Some time ago Microsoft released the first preview of a tool that allows you to monitor and control your Azure Data Factory (ADF). It is a web UI where you can select your Pipeline, Activity or Dataset and check its execution state over time. From my very personal point of view, however, the UI could be much better, and especially much clearer, than it is at the moment. But that’s not really a problem, as the thing I like most about ADF is that it is quite open for developers (for example, Custom C#/.Net Activities) and it also offers a quite comprehensive REST API to control and manage it.
For our monitoring purposes we are mainly interested in the LIST interfaces, but we could perform basically every operation using this API. In my example I only used the Dataset API, the Slices API and the Pipeline API.

First we start with the Dataset API to get a list of all data sets in our Data Factory. This is quite simple as we just need to build our URL of the REST web service like this:

  1. https://management.azure.com/subscriptions/{SubscriptionID}/resourcegroups/{ResourceGroupName}/providers/Microsoft.DataFactory/datafactories/{DataFactoryName}/datasets?api-version={api-version}

You can get all of this information from the Azure Portal by simply navigating to your Data Factory and checking the URL, which will be similar to this one:

  1. https://portal.azure.com/#resource/subscriptions/1234567832324a04a0a66e44bf2f5d11/resourceGroups/myResourceGroup/providers/Microsoft.DataFactory/dataFactories/myDataFactory

So this would be my values for the API Call:
– {SubscriptionID} would be “12345678-3232-4a04-a0a6-6e44bf2f5d11”
– {ResourceGroupName} would be “myResourceGroup”
– {DataFactoryName} would be “myDataFactory”
– {api-version} would be a fixed value of “2015-10-01”
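Putting those placeholders together is just string assembly; a minimal Python sketch (the function name `datasets_url` is my own, not part of the API) looks like this:

```python
# Hypothetical helper that assembles the ADF v1 "list datasets" URL
# from the values described above.
BASE = "https://management.azure.com"

def datasets_url(subscription_id: str, resource_group: str,
                 factory_name: str, api_version: str = "2015-10-01") -> str:
    """Build the Dataset API URL for one Data Factory."""
    return (
        f"{BASE}/subscriptions/{subscription_id}"
        f"/resourcegroups/{resource_group}"
        f"/providers/Microsoft.DataFactory/datafactories/{factory_name}"
        f"/datasets?api-version={api_version}"
    )

print(datasets_url("12345678-3232-4a04-a0a6-6e44bf2f5d11",
                   "myResourceGroup", "myDataFactory"))
```

The resulting string is exactly what you paste into PowerBI's "From Web" dialog.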

Once you have your URL, you can use PowerBI to query the API using Get Data –> From Web.
Next you need to authenticate using your Personal or Organizational Account – the same one that you use to sign in to the Portal – and also choose the level for which you want to use the credentials. I’d recommend setting it either to the subscription level or to the Data Factory itself, depending on your security requirements. This ensures that you are not asked for credentials for each different API:
[Screenshot: ADF_PowerBI_Authentication]
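Under the hood, PowerBI's Organizational-account sign-in just ends up sending an OAuth bearer token with each request. If you wanted to call the same endpoint outside PowerBI, a rough sketch could look like the following (token acquisition itself is out of scope here; `build_request` and `call_api` are illustrative names of my own):

```python
import json
import urllib.request

def build_request(url: str, bearer_token: str) -> urllib.request.Request:
    # PowerBI acquires and attaches the token for you; a raw REST call
    # only needs it as an OAuth bearer header.
    return urllib.request.Request(url, headers={
        "Authorization": f"Bearer {bearer_token}",
        "Accept": "application/json",
    })

def call_api(url: str, bearer_token: str) -> dict:
    """Perform the authenticated GET and parse the JSON response."""
    with urllib.request.urlopen(build_request(url, bearer_token)) as resp:
        return json.load(resp)
```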

This works in a very similar way for the Slices API, the Pipeline API and all other APIs available! The other transformations I used are regular PowerQuery/M steps done via the UI, so I am not going to describe them in more detail here. Also, setting up the relationships in our final PowerPivot model should be straightforward.
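For the Slices API, the URL follows the same scheme but adds the dataset name to the path and `start`/`end` as query parameters. A hedged sketch (the dataset name and dates here are placeholders, and `slices_url` is my own helper name):

```python
from urllib.parse import urlencode

BASE = "https://management.azure.com"

def slices_url(subscription_id: str, resource_group: str, factory_name: str,
               dataset_name: str, start: str, end: str,
               api_version: str = "2015-10-01") -> str:
    """Build the ADF v1 'list slices' URL for one dataset."""
    query = urlencode({"start": start, "end": end, "api-version": api_version})
    return (
        f"{BASE}/subscriptions/{subscription_id}"
        f"/resourcegroups/{resource_group}"
        f"/providers/Microsoft.DataFactory/datafactories/{factory_name}"
        f"/datasets/{dataset_name}/slices?{query}"
    )

print(slices_url("12345678-3232-4a04-a0a6-6e44bf2f5d11", "myResourceGroup",
                 "myDataFactory", "myDataset", "2017-01-01", "2017-01-05"))
```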

Now that we have all the required data in place, we can start with our report. I used some custom visuals for the calendar view, some slicers and a simple table to show the details. I also used a Sankey Chart to visualize the dependencies between the datasets.

[Screenshot: ADF_PowerBI_Monitoring_Dashboard]
[Screenshot: ADF_PowerBI_Monitoring_Dependencies]

Compared to the standard GUI for monitoring, this provides a much better overview of slices and their current states, and it also allows easy filtering. I am sure there are a lot of other PowerBI visualizations that would make sense here; these are just to give you an idea of how it could look, but of course you have all the freedom PowerBI offers for reporting!

The only drawback at the moment is that you cannot reschedule/reset slices from PowerBI, but for my monitoring use case this was not a problem at all. Also, I did not include the SliceRun API in my report as it would increase the size of the data model a lot, so detailed log information is not available in my sample report.

The whole PowerBI template is available for download on my GitHub site: https://github.com/gbrueckl/Azure.DataFactory.PowerBIMonitor

9 Replies to “Monitoring Azure Data Factory using PowerBI”

  1. Hello,

    I want to get in touch with you. I need some help regarding this report.
    Please let me know your skype id or phone number so that we can talk.

    Thanks for your help.

  2. I don’t see the option – ‘you can use PowerBI to query the API using Get Data –> From Web’.
    Does this document apply to the current version? I am trying to avoid the Monitor & Manage option and still be able to see the pipeline activity. I tried to use the Azure Log content pack but never succeeded. Any help is appreciated… Thanks

    • I just checked in the latest version of PBI Desktop – if you go to Get Data in the ribbon, there is the option “Web” – simply build the correct URL as described in the post and paste it there.
      Note that the whole blog post is for Azure Data Factory version 1! It does not work with ADF v2, as that uses a different API.
      I am working on a monitoring solution for Azure Data Factory v2 using Power BI, but I have no ETA yet.

      -gerhard

        • Hi Shalabh,

          unfortunately ADFv2 does not provide the status of a dataset for each slice as ADFv1 did. Therefore the monitoring solution currently cannot work the same way as in ADFv1. I did some prototyping already but could not find a proper way to do monitoring using the information that is provided by the API.
          If I come up with a proper solution for ADFv2, I will write a dedicated blog post about it

          regards,
          -gerhard

  3. Hi Gerhard,

    I tried to use the method you have shared to monitor my pipelines in ADF, but I am facing the following issues:

    a. I am able to fetch only 35 pipelines and 100 datasets.
    b. The pipelines and datasets fetched are not in a particular order; they are just picked in a random order.

    Actually, the solution I am working on has approximately 250 pipelines and 1000 datasets.
    Is there any way to pull all of them into PowerBI, and in a particular order, as I have to perform some operations on the fetched data?

    Regards
    Aditya

    • Hi,

      Well, by default the ADF API only returns 100 elements. If you have more than 100 elements (e.g. datasets), then the API returns a “nextLink” property containing a link where you get the next 100 elements and so on. Here is the M-Code that does this recursively:

      let
          ReadRestAPIRecursive = (
              URL as text,
              optional NextLinkName as text
          ) as list =>
              let
                  //URL = "https://management.azure.com/subscriptions/9b58aada-e077-4b89-a95e-85d0cd66b941/resourcegroups/ewCustomerAnalyticsHub/providers/Microsoft.DataFactory/datafactories/ewCAH-DataFactory/datasets/DW_Stage_Responsys_Complaint/slices?start=2014-12-31&end=2017-01-05&api-version=2015-10-01",
                  //NextLinkName = null,
                  NextLink = if NextLinkName = null then "nextLink" else NextLinkName,
                  ApiResult = Json.Document(Web.Contents(URL)),
                  NextLinkUrl = try Record.Field(ApiResult, NextLink) otherwise null,
                  CombinedResults = if NextLinkUrl = null then {ApiResult} else List.Combine({{ApiResult}, @ReadRestAPIRecursive(NextLinkUrl, NextLink)})
              in
                  CombinedResults
      in
          ReadRestAPIRecursive
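      For readers outside PowerBI, the same follow-the-nextLink pattern can be sketched in Python; here `fetch` is an injectable stand-in for the real authenticated HTTP call, and `read_all_pages` is an illustrative name of my own:

```python
from typing import Callable, Optional

def read_all_pages(url: str,
                   fetch: Callable[[str], dict],
                   next_link_name: str = "nextLink") -> list:
    """Collect every page of an ADF list response by following nextLink."""
    pages = []
    next_url: Optional[str] = url
    while next_url is not None:
        page = fetch(next_url)               # fetch() performs the GET
        pages.append(page)
        next_url = page.get(next_link_name)  # absent on the last page
    return pages

# Usage with a fake two-page API, purely for illustration:
fake_api = {
    "p1": {"value": [1, 2], "nextLink": "p2"},
    "p2": {"value": [3]},
}
pages = read_all_pages("p1", fake_api.__getitem__)  # returns both pages
```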

      This should solve your issue

      I will also update the template on GitHub accordingly!

      Kind regards,
      -gerhard
