Trigger Cube-Processing from Excel using Cube Actions

Recently I faced the requirement to give specific users of my Analysis Services Multidimensional cube the opportunity to process the cube whenever they want. There are several ways to achieve this:
– start a SQL Agent that processes the cube
– create some kind of WebService that does the processing
– create some custom .NET application that does the processing
– run some script (PowerShell, Command Line, ascmd.exe, …)

 


NOTE:
The post describes a solution which requires SSAS Server Admin rights. If you are interested in a solution which triggers an asynchronous processing without requiring SSAS Server Admin Rights please scroll down to the very bottom and download VS_Solution.zip
It basically runs the same script but in a separate thread in the context of the service account so the calling user does not need to have any specific rights on the server.


From a usability point-of-view none of the above is really convenient as all options involve some kind of external process or application. In this post I will show how to process a cube using cube actions which can be triggered from Excel Pivot Tables natively. So this is what we want to achieve:

Excel_Action

 

This requires several steps:
1) create custom .Net assembly
2) add that custom .Net assembly to your Analysis Services database
3) add an action to your cube that calls the assembly

 

Creating a Custom .Net assembly:

First of all we need to create a new “Visual C#” project of type “Class Library”. To work with Analysis Services we need to add two references:
VS_References

“Microsoft.AnalysisServices” refers to Microsoft.AnalysisServices.dll which  can be found in the shared features folder of your installation (default is c:Program FilesMicrosoft SQL Server110SDKAssemblies)

“msmgdsrv” refers to msmgdsrv.dll which can found be in the OLAPbin-folder of your SSAS instance (default is c:Program FilesMicrosoft SQL ServerMSAS11.MSSQLSERVEROLAPbin)

Once these references are added, we can start to create our processing code:

using System;
using AdomdServer = Microsoft.AnalysisServices.AdomdServer;
using Microsoft.AnalysisServices;
using System.Data;

namespace ASSP_Processing
{
    public class Processing
    {
        public const string LoggingPrefix = “ASSP.ProcessObject: “;

        public enum ResultHandling
        {
            Datatable = 0,
            Exception = 1,
            NULL = 2
        }

        public static DataTable ProcessObject(string cubeName, ResultHandling resultHandling)
        {
            DataTable ret = new DataTable();
            ret.Columns.Add(“Result”);

            Server server = null;

            try
            {
                server = new Server();

                //connect to the current session…
                server.Connect(“*”);

                AdomdServer.Context.CheckCancelled(); //could be a bit long running, so allow user to cancel

                Database db = server.Databases.GetByName(AdomdServer.Context.CurrentDatabaseName);

                string objectToProcessName = “#N/A”;
                ProcessableMajorObject objectToProcess = db;

                AdomdServer.Context.TraceEvent(0, 0, LoggingPrefix + “Database <“ + db.Name + “> found!”);
                objectToProcessName = “DB[“ + db.Name + “]”;

                if (!string.IsNullOrEmpty(cubeName))
                {
                    Cube cube = db.Cubes.GetByName(cubeName);

                    if (cube != null)
                    {
                        objectToProcess = cube;
                        AdomdServer.Context.TraceEvent(0, 0, LoggingPrefix + “Cube <“ + cubeName + “> found!”);
                        objectToProcessName = objectToProcessName + ” > Cube[“ + cubeName + “]”;
                    }
                }

                if (objectToProcess != null)
                {
                    AdomdServer.Context.TraceEvent(0, 0, LoggingPrefix + “Processing Object “ + objectToProcessName + ” …”);

                    objectToProcess.Process(ProcessType.ProcessFull);

                    ret.Rows.Add(new object[] { “S U C C E S S:    Object “ + objectToProcessName + ” successfully processed!” });
                    AdomdServer.Context.TraceEvent(0, 0, LoggingPrefix + “Finished Processing Object “ + objectToProcessName + “!”);
                }
            }
            finally
            {
                try
                {
                    if (server != null)
                        server.Disconnect();
                }
                catch { }
            }

            // if processing was successful a row has been added beforehand
            if (ret.Rows.Count == 0)
            {
                ret.Rows.Add(new object[] { “F A I L U R E:    Error while processing an object!” });
            }

            switch (resultHandling)
            {
                case ResultHandling.Datatable:
                    return ret;

                case ResultHandling.Exception:
                    throw new Exception(Environment.NewLine + Environment.NewLine + Environment.NewLine + ret.Rows[0][0].ToString());

                case ResultHandling.NULL:
                    return null;
            }

            return null;
        }

        public static DataTable ProcessObject(string cubeName, int resultHandling)
        {
            return ProcessObject(cubeName, (ResultHandling)resultHandling);
        }

        public static DataTable ProcessObject(string cubeName)
        {
            return ProcessObject(cubeName, 1);
        }
    }
}

There are two things here that are worth to point out. First of all we need to somehow establish a connection to the server that hosts the SSAS database in order to process an object. The first thing that comes into mind would be to create a new connection to the server and run some processing XMLA. Unfortunately this does not work here because as this would result in a deadlock. The assembly will be called as a Rowset-Action which in terms of locking is similar to a query. So if we run separate processing command within our Rowset-Action this processing command can never be committed as there is always a query running on the database. The query (=Rowset-Action) will wait until the function is finished and the function waits until the processing is committed resulting in a classical deadlock!

To avoid this we need to connect to the current users sessions using “*” as our connection string:

//connect to the current session…
server.Connect(“*”);

 

The second thing to point out is the return value of our function. In order to be used in a Rowset-action, the function has to return a DataTable object. The results would then be displayed similar to a drill through and a separate worksheet would pop up in Excel showing the result:

Result_DataTable

From my point-of-view this can be a bit annoying as after each processing this new worksheet pops up and you loose the focus of the pivot table. So I investigated a bit and found another way to display the output of the processing.

When a drill through / action throws an exception, this will also be displayed in Excel without actually displaying the result of the drill through / action in a separate worksheet. By default it may look like this message:

Result_NULL

Well, not really much information here right?

To display more information, e.g. that the processing was successful (or not) we can throw our own exception in the code after processing is done:

Result_Exception

I added a separate parameter to control this behavior, in the code above the default would be option 3 – custom Exception.

 

Once this project is built the DLL can be added to the SSAS database. It is necessary to grant the assembly “Unrestricted” permission set:

AssemblySettings

 

The last step would be to create the appropriate cube actions that call the assembly:

CubeActionConfiguration

It is necessary that the action is of Type=Rowset, otherwise it would not be displayed in Excel! The Action Target is up the you, in this example it will be displayed on every cell. The Action expression is the actual call to our assembly passing the cube name as a parameter. The second parameter controls how the result is displayed and could also be omitted.  In this example I added 3 separate actions, one for each result-type (as described above).

And that’s it! This simple steps allow you and your users to process any cube (or any other database object) from within your favorite client application, assuming it supports cube actions.

This is the final result where the three cube actions can be triggered from an Excel pivot table:

Excel_Action_Final

 

The attached zip-file includes everything that is necessary to recreate this example:
– SSAS project
– ASSP_Processing project

CubeProcessingAction.zip

Solution for Asynchronous Processing without Admin-Rights:
VS_Solution.zip

SSAS Dynamic Security and Visual Totals

Security is always an important aspect of any BI solution. Especially for big enterprise solutions the security-concept can become very complex. Analysis Services Multidimensional  in general offers two option on how to define security: Cell Data Permissions and Dimension Data Permissions. Because of the huge impact on performance Cell Data Permissions are barely used. Dimension Data Permissions are usually the way to go and can cover probably 98% of the security requirements. Though, also Dimension Data Permissions can get quite complex, especially for large-scale cubes with 100+ or even 1000+ users. Just think of an Analysis Services that holds sales data associated to KeyAccounts – further assume that we have 100 KeyAccounts. Each of this KeyAccounts is managed by one user, so we would end up creating 100 roles – one for each KeyAccount and its manager. In terms of maintenance this can get quite complex and in the long run unmanageable.

To address this issue Analysis Services offers the concept of Dynamic Security. Dynamic Security is evaluated at runtime for each user and allows you to consolidate your roles. So for our example from above instead of having 100 different roles we would end up having one dynamic role serving all 100 users. Basically, when a user connects it queries internal data (within the cube using MDX) or external data (using Analysis Services Stored Procedure (ASSP)/Assemblies) to get the security settings for the current user. Especially in combination with ASSP this can be quite powerful as you can use any datasource that you can query using .Net-code to derive your security from.

To make a role "dynamic" you basically have two options:
1) Username() – returns the current users in format "MyDomainMyUser"
2) CustomData() – returns whatever was specified in the connectionstring’s CustomData-property

Both are native MDX-functions and can be used in an expression to build your Allowed Set, Denied Set or DefaultMember.

Role_UI

This works just fine, as long as the expressions returns a valid MDX set/member of course.
But what about the Visual Totals – setting? You can only check or uncheck the checkbox and you are not able to use any dynamic expressions. That’s what the UI offers you – no more no less. But there might be some rare scenarios where you also need to make the Visual Totals dynamically based on Username() or CustomData() – and this is what this post is about.

 

So, as we have already seen above, it is not possible to make the Visual Totals dynamically using the UI. Fortunately there are also other ways besides the UI to work with the cube and modify its structure. I am talking about XMLA (XML for Analysis Services) here, which is the protocol that is used for all communications between a client and Analysis Services. Instead of writing XMLA on your own there is also a programmatically way to do this: AMO (Analysis Management Object). AMO is a .Net library that allows you to integrate calls to Analysis Services within any .Net code. After modifying any object in the code AMO generates the according XMLA for you and sends it to the server which is a much more convenient way than writing plane XMLA on your own.

This means that you can also create/modify your security roles using AMO – this approach is already very well described by Benny Austin in his blog about SSAS: Using AMO to Secure Analysis Service Cube. I recommend reading this article before you continue here.

For this little example I create a slimmed down version of Adventure works. The model contains only the Date and the Product dimension. We will put our dynamic security on the Category attribute of our Product dimension. The model also already contains a predefined empty role called "DynamicVisualTotals" that we will modify later (after the model has been deployed) using AMO. Please note here that AMO only work with the Analysis Service directly (online) but not at design time in Visual Studio. Once everything is deployed this is the AMO code to use to modify the predefined role:

using Microsoft.AnalysisServices;

namespace ModifyRole_AMO
{
    class Program
    {
        static void Main(string[] args)
        {
            using (Server oServer = new Server())
            {
                oServer.Connect("localhost");

                using (Database oDB = oServer.Databases.GetByName("DynamicVisualTotals"))
                {
                    Role oRole = oDB.Roles.GetByName("DynamicVisualTotals");

                    Dimension oDim = oDB.Dimensions.GetByName("Product");
                    DimensionAttribute oAttr = oDim.Attributes.FindByName("Category");
                    DimensionPermission oDimPermission = oDim.DimensionPermissions.FindByRole(oRole.ID);

                    AttributePermission oAttrPermission = oDimPermission.AttributePermissions.Find(oAttr.ID);

                    // can be done in the UI
                    oAttrPermission.AllowedSet = "DynamicSecurity.DynamicAllowedSet(CustomData())";
                    // can not be done in the UI
                    oAttrPermission.VisualTotals = "DynamicSecurity.DynamicVisualTotals(CustomData())";

                    oDimPermission.Update();
                }
            }
        }
    }
}

1) connect to server "localhost"
2) connect to database "DynamicVisualTotals"
3) get the Role-object for role "DynamicVisualTotals" from the database
4) get the Dimension-object for dimension "Product"
5) get the DimensionAttribute-object for attribute "Category"
6) get the DimensionPermission that is associated to our Role for dimension "Product"
7) get the AttributePermission that is associated to our attribute "Category"

So far this is straight forward. The AttributePermission object actually represents the UI that we have seen above. It has properties for all the fields and textboxes on the UI. The most important of course are:
– AllowedSet
– DeniedSet
– Defaultmember
– and VisualTotals

For the first three I was already aware that they are of type String but the VisualTotals I expected to be Boolean which turned out to be wrong – it is also a String!

VisualTotalsIsString 

Naïve as I am I simply put a MDX expression in there that resolves to "1" or "0" (Boolean, similar to the Checkbox in the UI, it could probably also resolve to "true" or "false" or a Boolean type directly but I did not further test this) – and, it worked like a charm! The MDX expression I used refers to an ASSP called DynamicSecurity which contains a function called "DynamicVisualTotals" that returns "1" or "0" based on what was passed in the connectionstring using CustomData():

using Microsoft.AnalysisServices.AdomdServer;

namespace DynamicSecurity
{
    public static class DynamicSecurity
    {
        public static string DynamicVisualTotals(string customData)
        {
            if (customData == "1")
                return "1";

            return "0";
        }

        public static Set DynamicAllowedSet(string customData)
        {
            string mdxEpression;
            Expression exp;
            Set s;

            if (customData == "1")
                mdxEpression = "{[Product].[Category].&[1]}";

            else
                mdxEpression = "{[Product].[Category].&[3], [Product].[Category].&[4]}";

            exp = new Expression(mdxEpression);
            s = exp.CalculateMdxObject(null).ToSet();

            return s;
        }
    }
}

(This ASSP has to be compiled and added to the database first to make the Dynamic Security work. This is describe here in more detail.)

And that’s it!

Once everything is done
1) Deploy the solution to the sever
2) (add ASSP if not already included in the solution)
3) Execute the AMO code (first listing)
we can use Excel to check the results:

FinalResults

 

The used connection strings are as follows:

Provider=MSOLAP.5;Integrated Security=SSPI;Persist Security Info=True;Data Source=localhost;Initial Catalog=DynamicVisualTotals;Roles=DynamicVisualTotals;CustomData=0

 

Provider=MSOLAP.5;Integrated Security=SSPI;Persist Security Info=True;Data Source=localhost;Initial Catalog=DynamicVisualTotals;Roles=DynamicVisualTotals;CustomData=1

Make sure to also use the Roles-property for testing as if you are administrator these privileges would overwrite all other security settings!

 

Additional Notes:

As AMO is just a wrapper for XMLA we could also use XMLA directly to modify our role:

<Alter AllowCreate="true" ObjectExpansion="ObjectProperties" xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
    <Object>
        <DatabaseID>DynamicVisualTotals</DatabaseID>
        <DimensionID>Dim Product</DimensionID>
        <DimensionPermissionID>DimensionPermission</DimensionPermissionID>
    </Object>
    <ObjectDefinition>
        <DimensionPermission xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100" xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200" xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200" xmlns:ddl300="http://schemas.microsoft.com/analysisservices/2011/engine/300" xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300" xmlns:ddl400="http://schemas.microsoft.com/analysisservices/2012/engine/400" xmlns:ddl400_400="http://schemas.microsoft.com/analysisservices/2012/engine/400/400">
            <ID>DimensionPermission</ID>
            <Name>DimensionPermission</Name>
            <RoleID>Role</RoleID>
            <Read>Allowed</Read>
            <AttributePermissions>
                <AttributePermission>
                    <AttributeID>Product Category Name</AttributeID>
                    <AllowedSet>DynamicSecurity.DynamicAllowedSet(CustomData())</AllowedSet>
                    <VisualTotals>DynamicSecurity.DynamicVisualTotals(CustomData())</VisualTotals>
                </AttributePermission>
            </AttributePermissions>
        </DimensionPermission>
    </ObjectDefinition>
</Alter>

 

Please note that XMLA always uses the internal IDs which may not be the same as the name of the objects!

 

In this post I showed how to use AMO and XMLA to modify a security role and use an expression to make the VisualTotals setting dynamic which is not possible using the UI only. This is just a very simplified example but I used this approach on some of my enterprise deployments and it works just fine also for very complex scenarios.

 

The attached zip-file includes everything that is necessary to recreate this example:
– SSAS project
– ASSP project
– AMO project (and XMLA script)
– final Excel with results

DynamicVisualTotals.zip

DAX vs. MDX: DataMembers in Parent-Child Hierarchies

Recently when browsing the MSDN PowerPivot Forums I came across this thread where the question was ask on how to show the value which is directly linked to an element in a parent-child hierarchy instead of the aggregation of all "children". In this post I am going to address this problem finally showing a proper solution.

 

First of all I would like to start with some background to so called "datamembers". The term "datamember" originates from parent-child hierarchies in multidimensional models. It is a dummy-member that is created automatically below each hierarchy-node to hold values that are linked to a node-member directly. This is of course only possible for parent-child hierarchies.

Take the following MDX-Query for example:

SELECT
[Measures].[Sales Amount Quota] ON 0,
Descendants(
    [Employee].[Employees].&[290],
    1,
    SELF_AND_BEFORE) ON 1
FROM [Adventure Works]
WHERE [Date].[Calendar].[Calendar Year].&[2006]

HiddenDatamember

As we can see, the value of "Amy E. Alberts" is not the Sum of its children but has a difference of $108,000.00 This difference is because of the "datamember" that is also a child of every parent-child node which may be hidden. For multidimensional models his behavior can be changed by setting the MembersWithData-Property to "NonLeafDataVisible":

VSProperties

 

Executing the same query again we now get this results:

VisibleDatamember

Now we see "Amy E. Alberts" twice – the node and the datamember.

 

Doing a similar query on a tabular model we will get this result by default:

q4evw0ne

As you can see, we still have the datamember but without any caption. Though, this can of course be changed when you flatten out the hierarchy using DAX by simply replacing BLANK() by "<my datamember caption>" in your calculated columns.

 

Anyway, this is not the topic of this post but just some background information. Now we will focus on how to query data that is directly associated to a node or its corresponding datamember in MDX and DAX.

 

MDX:

For MDX this is quite easy as the datamember can be referenced by simply using ".datamember" of a given hierarchy node:

WITH

MEMBER [Measures].[DataMember] AS (
[Measures].[Sales Amount Quota],
[Employee].[Employees].currentmember.datamember
), FORMAT_STRING = 'Currency'

SELECT
{
[Measures].[Sales Amount Quota],
[Measures].[DataMember]
} ON 0,
Descendants(
    [Employee].[Employees].&[290],
    1,
    SELF_AND_BEFORE) ON 1
FROM [Adventure Works]
WHERE [Date].[Calendar].[Calendar Year].&[2006]

DatamemberValue_MDX

The DataMember-Measure only shows values that are directly linked to the current node of the parent-child hierarchy. This is pretty straight forward and very easy to accomplish.

 

DAX:

In this DAX this problem is a bit more tricky as we do not have that built-in ".datamember" function. Also navigation within hierarchies and parent-child hierarchies itself are not really supported in DAX. Though, there are several blogs out there that describe how to handle parent-child hierarchies by Alberto Ferrari (here) and Kasper de Jonge (here). Based on these solutions we can create our datamember-calculation.

First (and only) thing we need is to add a calculated column that stores the path depth of the current row. This can be achieved by using this formula:

=PATHLENGTH(
    PATH(
        Employee[EmployeeKey],
        Employee[ParentEmployeeKey]
    )
)

 

Using our previous example and selecting "Amy E. Alberts" together with our new column we get this:

CalculatedColumn_Level

As we can see there are two different Levels below "Amy E. Alberts" where Level=3 holds our datamember-value and Level=4 holds all the other values (=real child members). Lets add a calculated measure to help you make the final calculation more understandable:

MinLevel:=MIN(Employee[Level])

Calculation_MinLevel

So, for our [DataMember]-calculation we simply have to extend the context to only include rows where [Level] = [MinLevel]

DataMember:=CALCULATE(
SUM('Sales Quota'[Sales Amount Quota]),
FILTER('Employee',
    'Employee'[Level] = [MinLevel]))

CalculationNotWorking_DAX

Well, obviously our calculation is not returning the expected results. This is because of a very common pitfall which I am also very prone to. Whenever a calculated measure is used within an other calculated measure, a CALCULATE() gets wrapped around it thereby taking over the current context. In our case this means that ‘Employee'[Level] will always be the same as [MinLevel] as [MinLevel] gets evaluated for each row and doing MIN() on only one row of course always returns the value of the current row which we compared it to!

To solve this issue we have to place the original [MinLevel]-calculation directly into our [DataMember]-calculation:

DataMember:=CALCULATE(
SUM('Sales Quota'[Sales Amount Quota]),
FILTER('Employee',
    'Employee'[Level] = MIN(Employee[Level])))

CalculationWorking_DAX

This change finally makes the calculation return the correct results also when browsing using the hierarchy:

DatamemberValue_DAX

 

Even though DAX is not very good in handling hierarchies, the [DataMember]-calculation can be accomplished quite easily. This calculation also works for hierarchies of any depth without needing to adopt it. In terms of performance it may be further improved to the following formula – though the impact will be minor:

DataMember_v2:=CALCULATE(
SUM('Sales Quota'[Sales Amount Quota]),
FILTER(VALUES('Employee'[Level]),
    'Employee'[Level] = MIN(Employee[Level])))

 

Download Final Model (Office 2013!)

Another Post about Calculating New and Returning Customers – Part 2

In my previous post I showed a new approach on how to calculate new (and returning) customers in PowerPivot/tabular using DAX. We ended up with a solution where we added the customers first order date as a calculated column to our customer-table. This column was then linked to our date-table with an inactive relationship. The final calculation used USERELATIONSHIP() to make use of this relationship as follows:

New Customers:=CALCULATE(
COUNTROWS(Customer),
USERELATIONSHIP(Customer[FirstOrderDate], ‘Date’[Date]))

This calculation performs really well as it does not have to touch the fact-table to get the count of new customers. And this is also the issue with the calculation as other filters are not reflected in the calculation:

Issue_Old_vs_New

Take row 2 as an example: we have 8 “Total Customers” of which 12 are “New Customers”. Obviously an error in the calculation. The PivotTable is filtered to Category=”Road Bikes” and we have 8 customers on the 2nd of February that bought a road bike. The “New Customers” calculation on the other hand is not related to the Subcategory and shows 12 as in total there were 12 new customers for all products.

 

To get our calculation working also with other filters we have to somehow relate it to our fact-table. So far we calculated the customers first order date only in the customer table. The customers first order may be related to several fact-rows, e.g. one row for each product the customer bought. Our “New Customers” calculation should only include customers that are active considering also all other filters.

To identify a customers first order in our fact-table we can again use a calculated column and also re-use our previous calculated column in our Customer-table that holds the customers first order date:

=NOT(
ISBLANK(
LOOKUPVALUE(
Customer[CustomerKey],
Customer[CustomerKey],
[CustomerKey],
Customer[FirstOrderDate],
[Order Date]
)))

This returns True for all fact-rows associated with a customers first order and False for all other rows.

The final “New Customers v2” calculation is quite simple then – in addition to the active filters we add a further filter to only select rows that are associated to a customers first order:

New Customers v2:=CALCULATE(
[Total Customers],
‘Internet Sales’[IsCustomersFirstOrder] = TRUE())

 

And this are the results:

Final_Old_vs_NewV2

As you can see there are still differences between “New Customers OLD” and “New Customers v2”. But is this really a problem with the new calculation? Lets analyze the issue taking customer “Desiree Dominguez” where we encounter the first difference as an example:

Issue_Analyzed

“Desiree Dominguez” had her first order on the 22th of June in 2006. So she is actually no “new customer” in 2008. The reason why the old calculation counts her as “new customer” is that it was the first time that she bought a product of subcategory “Road Bikes”. Whether this is correct or not is up to your business definition of a “new customer”. According to my experience it is more likely that “Desiree Dominguez” is not counted as a new customer in 2008 and so the “New Customer v2” actually returns the more accurate results.

 

Additional stuff:

An other option for this calculation is to rank the [Order Date] or [Sales Order Number] for each customer within the fact-table using the calculation below:

=RANKX(
FILTER(
ALL(‘Internet Sales’),
[CustomerKey] = EARLIER([CustomerKey])),
[Order Date],
[Order Date],
1,
DENSE
)

[Order Date] could be replaced by [Sales Order Number]. This makes sense if a customer can have multiple orders per day and you also want to distinguish further by [Sales Order Number]. The new field would also allow new analysis. For example the increase/decrease in sales from the second order compared to the first order and so on.

The “New Customer” calculation in this case would still be similar. We just have to filter on the new calculated column instead:

New Customers v3:=CALCULATE(
[Total Customers],
‘Internet Sales’[CustomersOrderNr] = 1)

 

Download Final Model (Office 2013!)

 

 

The multidimensional model:

The whole logic of extending the fact-table to identify rows that can be associated with a customers first order can also be used in a multidimensional model. Once we prepared the fact-table accordingly the calculations are quite easy. The biggest issues here does not reside in the multidimensional model itself but in the ETL/relational layer as this kind of operation can be quite complex – or better say time-consuming in terms of ETL time.

At this point I will not focus on the necessary ETL steps but start with an already prepared fact-table and highlight the extensions that have to be made in the multidimensional model. The fact-table already got extended by a new column called [IsCustomersFirstOrder] similar to the one we created in tabular using a DAX calculated column. It has a value of 1 for rows associated with a customers first order and 0 for all other rows.

The next thing we have to do is to create a new table in our DSV to base our new dimension on. For simplicity I used this named query:

MD_NamedQuery_Dim

This table is then joined to the new fact-table:

MD_DSV

The new dimension is quite simple as it only contains one attribute:

MD_Dimension

You may hide the whole dimension in the end as it may only be used to calculate our “new customers” and nowhere else and may only confuse the end-user.

 

Once we have added the dimension also to our cube we can create a new calculated measure to calculate our “new customers” as follows:

CREATE MEMBER CURRENTCUBE.[Measures].[New Customers] AS (
[Measures].[Customer Count],
[Is Customers First Order].[Is Customers First Order].&[1]
), ASSOCIATED_MEASURE_GROUP = ‘Internet Customers’
, FORMAT_STRING = ‘#,##0’;

The calculation is based on the existing [Customer Count]-measure which uses DistinctCount-aggregation. Similar to DAX with just extend the calculation by further limiting the cube-space where “Is customers First Order” = 1.

This approach also allows you to create aggregations if necessary to further improve performance. So this is probably also the best way in terms of query-performance to calculate the count of new customers in a multidimensional model.

DAXMD and DefaultMembers

With Power View for Multidimensional Models – Preview Microsoft recently made the first SQL Server version available that allows you to query your multidimensional models using DAX, or the be exactly, using DAXMD. As tabular and multidimensional model are fundamentally different in terms of their underlying data structures , there are also some differences in how to query them. Jason Thomas already blogged about some of those differences here and showed how to query attributes in DAXMD. In this blog I will focus on how DefaultMembers are handled in DAXMD.

As you probably know, there is no concept of DefaultMembers in tabular models but they are essential for multidimensional models. In most cases the DefaultMember is the All-Member of the hierarchy. This is also no issue for tabular models but it gets more tricky if there is a DefaultMember defined in the multidimensional model and you query it using a query language that is designed for tabular models like DAXMD.

For my tests I used the AdventureWorks Multidimensional Model (enterprise) from SQL Server 2008R2 which can be downloaded here.

This model contains a Dimension called [Scenario] which is Non-Aggregateable and also has a DefaultMember defined. To get a list of all available scenarios you could simply write the following MDX:

SELECT
{[Measures].[Amount]} ON 0,
[Scenario].[Scenario].[Scenario].allmembers ON 1
FROM [Adventure Works]

The result looks like this:
image

To get the similar result in DAX you would usually write a query like this:

EVALUATE
    SUMMARIZE(
        'Scenario',
        'Scenario'[Scenario],
        "Amount", 'Financial Reporting'[Amount])

If you run this query you get an error that says, that

Column [Scenario] is part of composite key, but not all columns of the composite key are included in the expression or its dependent expression.

This is because the our [Scenario]-attribute has a NameColumn that is different from it’s KeyColumn. DAXMD handles this as it would be a composite key as the NameColumn may not be unique without the KeyColumn. So you always also have to include the Key-Column in your query. This is usually done by using the following syntax: ‘MyTable’[MyColumn.Key0] where Key0 refers to the first column of the composite key. So lets change our query and see what happens:

EVALUATE
    SUMMARIZE(
        'Scenario',
        'Scenario'[Scenario.Key0],
        "Amount", 'Financial Reporting'[Amount])

Now we get a different saying that the column [Scenario.Key0] could not be found in the source table. That’s very strange as this syntax works just well for all other attributes that have different Key- and NameColumns.

(I posted this issue on Connect as for me this is an inconsistent behavior – feel free to vote for it here)

So, to further investigate into this problem we first have to check what columns actually exist in our ‘Scenario’ table. This can be done using this query:

EVALUATE 'Scenario'

image

We notice several things here:
1) the column is called [Scenario.UniqueName] and opposed to [Scenario.Key0]
2) the resultset only contains 1 row
3) this one row is the DAXMD representation of our DefaultMember

Ok, now that we know how to reference the column we can adopt our query accordingly:

EVALUATE
    SUMMARIZE(
        'Scenario',
        'Scenario'[Scenario.UniqueName],
        "Amount", 'Financial Reporting'[Amount])

image

The query works now but still only shows only one row – our DefaultMember. This is the special behavior for DAXMD and how DefaultMembers are handled. It internally applies a filter on the columns where DefaultMembers are set. In this case, whenever you query your ‘Scenario’ table, it is internally handled like

FILTER(
    'Scenario',
    'Scenario'[Scenario.UniqueName] = "[Scenario].[Scenario].&[1]")

 

This internal filter can be removed like any other filter on a table or a column using the same functions that we would usually use – ALL(), ALLEXCEPT(), etc.

So to get our DAXMD equivalent to our MDX query we have to remove the filter from our ‘Scenario’ table:

EVALUATE
    SUMMARIZE(
        ALL('Scenario'),
        'Scenario'[Scenario.UniqueName],
        'Scenario'[Scenario],
        "Amount", 'Financial Reporting'[Amount])

image

Please also note that the last two rows [Budget Variance] and [Budget Variance %] are calculated members that are created in the cubes MDX script and they show up just like regular rows! So calculated members are seamlessly integrated into DAXMD – awesome! Especially  if you consider that DAXMD is NOT translated into MDX but is natively integrated into the engine!

In this blog I showed how DefaultMembers defined in multidimensional models are handled in a tabular query language like DAXMD and what pitfalls I encounter. I hope this blogs helps you to better understand these internals and not to make the same mistakes again that I already did.