Processing Azure Analysis Services with OAuth Sources (like Azure Data Lake Store)

As you probably know from my last blog post, I am currently upgrading the PowerBI reporting platform of one of my customer from a PowerBI backend (dataset hosted in PowerBI service) to an Azure Analysis Services backend. The upgrade/import of the dataset into Azure Analysis Services itself worked pretty flawless and after switching the connection of the reports everything worked as expected and everyone was happy. However, things got a bit tricky when it came to automatically refreshing the Azure Analysis Services database which was based on an Azure Data Lake Store. For the original PowerBI dataset, this was pretty straight forward as a scheduled refresh from an Azure Data Lake store data source works out of the box. For Azure Analysis Services this is a bit different.

When you build and deploy your data model from Visual Studio, your are prompted for the credentials to access ADLS which are then stored in the data source object of AAS. As you probably know, AAS uses OAuth authentication to access data from ADLS. And this also causes a lot of problems. OAuth is based on tokens and those tokens are only valid for a limited time, by default this is 2 hours. This basically means, that you can process your database for the next 2 hours and it will fail later on with an error message saying that the token expired. (The above applies to all OAuth sources!)
This problem is usually solved by using an Azure Service Principal instead of a regular user account where the token does not expire. Unfortunately, this is not supported at the moment for ADLS data sources and you have to work around this issue.

So the current situation that we need to solve is as follows:

  • we can only use regular user accounts to connect AAS to ADLS as service principals are not supported yet
  • the token expires after 2 hours
  • the database has to be processed on a regular basis (daily, hourly, …) without any manual interaction
  • manually updating the token is (of course) not an option

Before you continue here, make sure that you read this blog post first: https://blogs.msdn.microsoft.com/dataaccesstechnologies/2017/09/01/automating-azure-analysis-service-processing-using-azure-automation-account/
It describes the general approach of using Azure Automation to process an Azure Analysis Services model and most of the code in this blog post if based on this!
Also this older blog post will be a good read as some concepts and code snippets are reused here.

Back to our example – as we were already using Azure Automation for some other tasks, we decided to also use it here. Also, PowerShell integrates very well with other Azure components and was the language of choice for us. To accomplish our goal we had to implement 3 steps:

  1. acquire a new OAuth token
  2. update the ADLS data source with the new token
  3. run our processing script

I could copy the code for the first step more or less from one of my older blog post (here) where I used PowerShell to acquire an OAuth token to trigger a refresh in PowerBI.

The second step is to update ADLS data source of our Azure Analysis Services model. To get started, the easiest thing to do is to simply open the AAS database in SQL Server Management Studio and let it script the existing datasource for you: AAS_Script_OAuth_DataSource
The resulting JSON will look similar to this one:

Update ADLS DataSource
{
  "createOrReplace": {
    "object": {
      "database": "Channel Analytics",
      "dataSource": "DS_ADLS"
    },
    "dataSource": {
      "type": "structured",
      "name": "DS_ADLS",
      "connectionDetails": {
        "protocol": "data-lake-store",
        "address": {
          "url": "https://mydatalake.azuredatalakestore.net"
        }
      },
      "credential": {
        "AuthenticationKind": "OAuth2",
        "token_type": "********",
        "scope": "********",
        "expires_in": "********",
        "ext_expires_in": "********",
        "expires_on": "********",
        "not_before": "********",
        "resource": "********",
        "id_token": "********",
        "kind": "DataLake",
        "path": "https://mydatalake.azuredatalakestore.net/",
        "RefreshToken": "********",
        "AccessToken": "********"
      }
    }
  }
}

The important part for us is the “credential” field. It contains all the information necessary to authenticate against our ADLS store. However, most of this information is sensitive so only asterisks are displayed in the script. The rest of the JSON (except for the “credential” field) is currently hardcoded in the PowerShell cmdlet so if you want to use it, you need to change this manually!
The PowerShell cmdlet then combines the hardcoded part with an updated “credential”-field which is obtained by invoking a REST request to retrieve a new OAuth token. The returned object is modified a bit in order to match the required JSON for the datasource.
Once we have our final JSON created, we can send it to our Azure Analysis Services instance by calling the Invoke-ASCmd cmdlet from the SqlServer module.
Again, please see the original blog post mentioned above for the details of this approach.

After we have updated our datasource, we can simply call our regular processing commands which will then be executed using the newly updated credentials.
The script I wrote allows you to specify which objects to process in different ways:

  • whole database (by leaving AASTableName and AASPartitionName empty)
  • a single or multiple table and all its partitions (by leaving only AASPartitionName empty)
  • or multiple partitions of a single table (by specifying exactly one AASTableName and multiple AASPartitionNames

If multiple tables or partitions are specified, the elements are separated by commas (“,”)

So to make the Runbook work in your environment, follow all the initial steps as described in the original blog post from Microsoft. In addition, you also need to create an Application (Type = “Native”) in your Azure Active Directory to obtain the OAuth token programmatically. This application needs the “Sign in and read user profile” permission from the API “Windows Azure Active Directory (Microsoft.Azure.ActiveDirectory)”:
AAD_App_Permissions
Also remember the ApplicationID, it will be used as a parameter for the final PowerShell Runbook (=parameter “ClientID”!
When it comes to writing the PowerShell code, simply use the code from the download at the end of this blog post.

For the actual credential that you are using, make sure that it has the following permissions:

  • to update the AAS datasource (can be set in the AAS model or for the whole server)
  • has access to the required ADLS files/folders which are processed (can be set e.g. via ADLS Data Explorer)
  • (if you previously used your own account to do all the AAS and ADLS development, this should work just fine)

In general, a similar approach should work for all kinds of datasources that require OAuth authentication but so far I have only tested it with Azure Data Lake Store!

Download: AAS_Process_OAuth_Runbook.ps1

Trigger Cube-Processing from Excel using Cube Actions

Recently I faced the requirement to give specific users of my Analysis Services Multidimensional cube the opportunity to process the cube whenever they want. There are several ways to achieve this:
– start a SQL Agent that processes the cube
– create some kind of WebService that does the processing
– create some custom .NET application that does the processing
– run some script (PowerShell, Command Line, ascmd.exe, …)

 


NOTE:
The post describes a solution which requires SSAS Server Admin rights. If you are interested in a solution which triggers an asynchronous processing without requiring SSAS Server Admin Rights please scroll down to the very bottom and download VS_Solution.zip
It basically runs the same script but in a separate thread in the context of the service account so the calling user does not need to have any specific rights on the server.


From a usability point-of-view none of the above is really convenient as all options involve some kind of external process or application. In this post I will show how to process a cube using cube actions which can be triggered from Excel Pivot Tables natively. So this is what we want to achieve:

Excel_Action

 

This requires several steps:
1) create custom .Net assembly
2) add that custom .Net assembly to your Analysis Services database
3) add an action to your cube that calls the assembly

 

Creating a Custom .Net assembly:

First of all we need to create a new “Visual C#” project of type “Class Library”. To work with Analysis Services we need to add two references:
VS_References

“Microsoft.AnalysisServices” refers to Microsoft.AnalysisServices.dll which  can be found in the shared features folder of your installation (default is c:Program FilesMicrosoft SQL Server110SDKAssemblies)

“msmgdsrv” refers to msmgdsrv.dll which can found be in the OLAPbin-folder of your SSAS instance (default is c:Program FilesMicrosoft SQL ServerMSAS11.MSSQLSERVEROLAPbin)

Once these references are added, we can start to create our processing code:

using System;
using AdomdServer = Microsoft.AnalysisServices.AdomdServer;
using Microsoft.AnalysisServices;
using System.Data;

namespace ASSP_Processing
{
    public class Processing
    {
        public const string LoggingPrefix = “ASSP.ProcessObject: “;

        public enum ResultHandling
        {
            Datatable = 0,
            Exception = 1,
            NULL = 2
        }

        public static DataTable ProcessObject(string cubeName, ResultHandling resultHandling)
        {
            DataTable ret = new DataTable();
            ret.Columns.Add(“Result”);

            Server server = null;

            try
            {
                server = new Server();

                //connect to the current session…
                server.Connect(“*”);

                AdomdServer.Context.CheckCancelled(); //could be a bit long running, so allow user to cancel

                Database db = server.Databases.GetByName(AdomdServer.Context.CurrentDatabaseName);

                string objectToProcessName = “#N/A”;
                ProcessableMajorObject objectToProcess = db;

                AdomdServer.Context.TraceEvent(0, 0, LoggingPrefix + “Database <“ + db.Name + “> found!”);
                objectToProcessName = “DB[“ + db.Name + “]”;

                if (!string.IsNullOrEmpty(cubeName))
                {
                    Cube cube = db.Cubes.GetByName(cubeName);

                    if (cube != null)
                    {
                        objectToProcess = cube;
                        AdomdServer.Context.TraceEvent(0, 0, LoggingPrefix + “Cube <“ + cubeName + “> found!”);
                        objectToProcessName = objectToProcessName + ” > Cube[“ + cubeName + “]”;
                    }
                }

                if (objectToProcess != null)
                {
                    AdomdServer.Context.TraceEvent(0, 0, LoggingPrefix + “Processing Object “ + objectToProcessName + ” …”);

                    objectToProcess.Process(ProcessType.ProcessFull);

                    ret.Rows.Add(new object[] { “S U C C E S S:    Object “ + objectToProcessName + ” successfully processed!” });
                    AdomdServer.Context.TraceEvent(0, 0, LoggingPrefix + “Finished Processing Object “ + objectToProcessName + “!”);
                }
            }
            finally
            {
                try
                {
                    if (server != null)
                        server.Disconnect();
                }
                catch { }
            }

            // if processing was successful a row has been added beforehand
            if (ret.Rows.Count == 0)
            {
                ret.Rows.Add(new object[] { “F A I L U R E:    Error while processing an object!” });
            }

            switch (resultHandling)
            {
                case ResultHandling.Datatable:
                    return ret;

                case ResultHandling.Exception:
                    throw new Exception(Environment.NewLine + Environment.NewLine + Environment.NewLine + ret.Rows[0][0].ToString());

                case ResultHandling.NULL:
                    return null;
            }

            return null;
        }

        public static DataTable ProcessObject(string cubeName, int resultHandling)
        {
            return ProcessObject(cubeName, (ResultHandling)resultHandling);
        }

        public static DataTable ProcessObject(string cubeName)
        {
            return ProcessObject(cubeName, 1);
        }
    }
}

There are two things here that are worth to point out. First of all we need to somehow establish a connection to the server that hosts the SSAS database in order to process an object. The first thing that comes into mind would be to create a new connection to the server and run some processing XMLA. Unfortunately this does not work here because as this would result in a deadlock. The assembly will be called as a Rowset-Action which in terms of locking is similar to a query. So if we run separate processing command within our Rowset-Action this processing command can never be committed as there is always a query running on the database. The query (=Rowset-Action) will wait until the function is finished and the function waits until the processing is committed resulting in a classical deadlock!

To avoid this we need to connect to the current users sessions using “*” as our connection string:

//connect to the current session…
server.Connect(“*”);

 

The second thing to point out is the return value of our function. In order to be used in a Rowset-action, the function has to return a DataTable object. The results would then be displayed similar to a drill through and a separate worksheet would pop up in Excel showing the result:

Result_DataTable

From my point-of-view this can be a bit annoying as after each processing this new worksheet pops up and you loose the focus of the pivot table. So I investigated a bit and found another way to display the output of the processing.

When a drill through / action throws an exception, this will also be displayed in Excel without actually displaying the result of the drill through / action in a separate worksheet. By default it may look like this message:

Result_NULL

Well, not really much information here right?

To display more information, e.g. that the processing was successful (or not) we can throw our own exception in the code after processing is done:

Result_Exception

I added a separate parameter to control this behavior, in the code above the default would be option 3 – custom Exception.

 

Once this project is built the DLL can be added to the SSAS database. It is necessary to grant the assembly “Unrestricted” permission set:

AssemblySettings

 

The last step would be to create the appropriate cube actions that call the assembly:

CubeActionConfiguration

It is necessary that the action is of Type=Rowset, otherwise it would not be displayed in Excel! The Action Target is up the you, in this example it will be displayed on every cell. The Action expression is the actual call to our assembly passing the cube name as a parameter. The second parameter controls how the result is displayed and could also be omitted.  In this example I added 3 separate actions, one for each result-type (as described above).

And that’s it! This simple steps allow you and your users to process any cube (or any other database object) from within your favorite client application, assuming it supports cube actions.

This is the final result where the three cube actions can be triggered from an Excel pivot table:

Excel_Action_Final

 

The attached zip-file includes everything that is necessary to recreate this example:
– SSAS project
– ASSP_Processing project

CubeProcessingAction.zip

Solution for Asynchronous Processing without Admin-Rights:
VS_Solution.zip