Excel CUBE-Functions and MDX UniqueNames

Two weeks ago at the German SQL Server Conference 2015 I was at Peter Myer’s session about Mastering the CUBE Functions in Excel. (PS: Peter is also speaking on our upcoming SQLSaturday #374 in Vienna next week and at PASS SQLRally in Copenhagen the week after). After his session we had a further discussion about this topic and our experiences on how to use Excels CUBE-functions in order to build nice Dashboards with native Excel functionalities that also work with e.g. Excel Services. Its always great to exchange with people that share the same passion on he same topic! One thing we both agreed on that is missing currently is a way to get the MDX UniqueName of something that is selected in a slicer, filter or simply in a cell using CUBEMEMBER-function. I once used a special Cube Measure which was created in MDX Script which returned the UniqueName of a given member that was selected together with this special measure. For this to work with Excel you need to know how Excel builds the MDX when querying cube values using CUBEVALUE-function. Here is a little example:

This produces the following MDX query:

1. SELECT
2. {
3.     (
4.         [Measures].[Internet Sales Amount],
5.         [Product].[Category].&[1]
6.     )
7. } ON 0
9. CELL PROPERTIES VALUE, FORMAT_STRING, LANGUAGE, BACK_COLOR, FORE_COLOR, FONT_FLAGS

So it basically creates a tuple that contains everything you pass into the CUBEVALUE-Function as a parameter. Knowing this we can create a calculated measure to get the MDX UniqueName of this tuple using MDX StrToTuple()- and MDX AXIS()-function:

1. MEMBER [Measures].[Excel TupleToStr] AS (
2. TupleToStr(axis(0).item(0))
3. )

Replacing the [Measures].[Internet Sales Amount] of our initial CUBEVALUE-function with this new measure would return this to Excel:

1. ([Measures].[Internet Sales Amount],[Product].[Category].&[1])

Ok, so far so good but nothing really useful as you need to hardcode the member’s UniqueName into the CUBEVALUE-function anyway so you already know the UniqueName.
However, this is not the case if you are dealing with Pivot Table Page Filters and/or Slicers! You can simply refer to them within the CUBEVALUE-function but you never get the UniqueName of the selected item(s). Well, at least not directly! But you can use the approach described above, using an special MDX calculated measure, to achieve this as I will demonstrate on the next pages.

Calculated measures can only be created using the Pivot Table interface but can also be used in CUBE-functions. So first thing you need to do is to create a Pivot Table and add a new MDX Calculated Measure:

!Caution! some weird MDX coming !Caution!

You may wonder, why such a complex MDX is necessary and what it actually does. What it does is the following: Based on the example MDX query that Excel generates (as shown above) this is a universal MDX that returns the MDX UniqueName of any other member that is selected together with our measure using the CUBEVALUE-function. It also removes the UniqueName of the measure itself so the result can be used again with any other measure, e.g. [Internet Sales Amount]
The reason why it is rather complex is that Excel may group similar queries and execute them as a batch/as one query to avoid too many executions which would slow down the overall performance. So we cannot just reference the first element of our query as it may belong to any other CUBEVALUE-function. This MDX deals with all this kinds of issues.

The MDX above allows you to specify only two additional filters but it may be extended to any number of filters that you pass in to the CUBEMEMBER-function. This would be the general pattern:

1. MID(
2.   IIf(axis(0).item(0).count > 0 AND
3.         NOT(axis(0).item(0).item(0).hierarchy IS [Measures]),
4.     "," + axis(0).item(0).item(0).hierarchy.currentmember.uniquename,
5.     "")
6. + IIf(axis(0).item(0).count > 1 AND
7.         NOT(axis(0).item(0).item(1).hierarchy IS [Measures]),
8.     "," + axis(0).item(0).item(1).hierarchy.currentmember.uniquename,
9.     "")
10. + IIf(axis(0).item(0).count > n AND
11.         NOT(axis(0).item(0).item(n).hierarchy IS [Measures]),
12.     "," + axis(0).item(0).item(n).hierarchy.currentmember.uniquename,
13.     "")
14. , 2)

After creating this measure we can now use it in our CUBE-functions in combination with our filters and slicers:

You may noted that I had to use CUBERANKEDMEMBER here. This is because filters and slicers always return a set and if we would pass in a set to our CUBEVALUE function a different MDX query would be generated which would not allow us to extract the single UniqueNames of the selected items using the approach above (or any other MDX I could think of). So, this approach currently only works with single selections! I hope that the Excel team will implement a native function to extract the UniqueName(s) of the selected items in the future to make this workaround obsolete!

Once we have our UniqeName(s) we can now use them in e.g. a CUBESET-function to return the Top 10 days for a given group of product (filter) and the selected year (slicer):

And that’s it!

So why is this so cool?

• It works with SSAS (multidimensional and tabular) and Power Pivot as Excel still uses MDX to query all those sources. It may also work with SAP HANA’s ODBO connector but I have not tested this yet!
• It does not require any VBA which would not work in Excel Services – this solution does!
• The calculation is stored within the Excel Workbook so it can be easily shared with other users!
• There is no native Excel functionality which would allow you to create a simple Top 10 report which works with filters and slicers as shown above or any more complex dynamic report/dashboard with any dynamic filtering.

So no more to say here – Have fun creating your interactive Excel web dashboards!

Note: You may also rewrite any TOPCOUNT expression and use the 4th and 5h parameter of the CUBESET-function instead. This is more native and does not require as much MDX knowledge:
However, if you are not familiar with MDX, I highly recommend to learn it before you write any advanced calculations as show above as otherwise the results might be a bit confusing in the beginning! Especially if you filter and use TOPCOUNT on the same dimension!

SSAS Dynamic Named Sets in Calculated Members

Recently at one of my customers we were doing some performance tuning on a SSAS Multidimensional cube and I came across something I have not seen before yet in my lifetime as a SSAS developer. Even the simplest select queries where running for several seconds, even if executed on warm cache! So first I thought this may be related to some SCOPE assignments overwriting values etc. Using my MdxScriptDebugger I could easily identify that none of the SCOPE assignments had any impact on the actual result of the query. However, what the result of the MdxScriptDebugger trace also revealed was that the query-time increased after a calculated member was created. Important to note here is that this specific member was NOT used in the test-query at all! Investigating into the calculated member turned out that it was using a Dynamic Named Set.

Dynamic Named Sets are usually used if you want a set to be re-evaluated in the current context opposed to Static Named Sets which are only evaluated once during creation. For example you can create a Dynamic Named Set for your Top 10 customers. Changing the Year or Country would cause a re-evaluation of the set and different customers are returned for your filter selections. These type of calculated sets is usually not a problem.

Another reason to used Dynamic Named Sets is to deal with Sub-Selects. Some client tools, foremost Microsoft Excel Pivot Tables, use Sub-Selects to deal with multiple selections on the same attribute. Lets do a little example on how to use Dynamic Named Sets here. Assume you need to calculate the average yearly sales for the selected years. This may sound very trivial at first sight but can be very tricky. In order to calculated the yearly average we first need to calculated how many years are in the currently selected:

1. CREATE MEMBER CURRENTCUBE.[Measures].[CountYears_EXISTING] AS (
2. COUNT(EXISTING [Date].[Calendar Year].[Calendar Year].members)
3. );

However, this does not work if Sub-Selects are used in the query:

The calculated member returns “6” (the overall number of years) instead of “2” (the actually selected number of years). The issue here is that the calculation is not aware of any Sub-Select or filters within the Sub-Select as it is executed only outside of the Sub-Select.

To work around this issue you can create a Dynamic Name Set and refer to it in your calculated member:

1. CREATE DYNAMIC SET [ExistingYears] AS {
2. EXISTING [Date].[Calendar Year].[Calendar Year].members
3. };
4.
5. CREATE MEMBER CURRENTCUBE.[Measures].[CountYears_DynamicSet] AS (
6. COUNT([ExistingYears])
7. );

Now we get the correct results for our Count of Years calculation and could simply divide our Sales Amount by this value to get average yearly sales. The reason for this is that Dynamic Named Sets are also evaluated within the Sub-Select and therefore a COUNT() on it returns the correct results here.

So this technique is quite powerful and is also the only feasible workaround to deal with this kind of problem. But as I initially said, this can also cause some performance issues!

1. CREATE DYNAMIC SET [ExpensiveSet] AS {
2. Exists(
3.     [Product].[Category].[Category].members,
4.     Filter(
5.         Generate(
7.             CrossJoin(
8.                 {[Date].[Calendar].CURRENTMEMBER},
10.                     Order(
11.                         [Customer].[Customer].[Customer].MEMBERS,
12.                         [Measures].[Internet Sales Amount],
13.                         BDESC),
14.                     10000))),
15.         [Measures].[Internet Order Quantity] > -1
16.     ),
17.     'Internet Sales'
18. )
19. };
20.
21. CREATE MEMBER CURRENTCUBE.[Measures].[UnusedCalc] AS (
22. COUNT([ExpensiveSet])
23. );

The [ExpensiveSet] is just a Dynamic Named Set which needs some time to be evaluated and the [UnusedCalc] measure does a simple count over the [ExpensiveSet]. Having these calculations in place you can now run any query against your cube and will notice that even the simplest query now takes some seconds to execute even if the new calculated member is not used:

1. SELECT
2. [Measures].[Internet Sales Amount] ON 0

I am quite sure that this behavior is related to how Dynamic Named Sets are evaluated and why they also work for Sub-Selects. However, I also think that calculations that are not used in a query should not impact the results and/or performance of other queries!

I just created a bug on Connect in case you want to vote it:
https://connect.microsoft.com/SQLServer/feedback/details/1049569

I know that Dynamic Named Sets in combination with calculated members is a not a widely used technique as I guess most developers are not even aware of its power. For those who are, please keep in mind that these kind of calculations get evaluated for every query which can be crucial if your Dynamic Named Set is expensive to calculate! This has also impact on meta-data queries!

MDX Script Debugger – Beta-Release

I recently built a tool which should help to debug the MDX scripts of an Analysis Services cube in order to track down formula engine related performance issues of a cube. As you probably know most of the performance issues out there are usually caused by poorly or wrong written MDX scripts. This tool allows you to execute a reference query and highlights the MDX script commands that are effectively used when the query is run. It provides the overall timings and how long each additional MDX script command extended the execution time of the reference query. The results can then either be exported to XML for further analysis in e.g. Power BI or a customized version of the MDX script can be created and used to run another set of tests.

The tool is currently in a beta state and this is the first official release – and also my first written tool that I share publicly so please don’t be too severe with your feedback   – just joking every feedback is good feedback!

Below is a little screenshot which shows the results after  the reference query is executed. The green lines are effectively used by the query whereas the others do not have any impact on the values returned by the query.

A list of all features, further information and also the source code can be found at the project page on codeplex:
https://mdxscriptdebugger.codeplex.com/
https://mdxscriptdebugger.codeplex.com/releases

Looking forward to your feedback and hope it helps you to track down performance issues of your MDX Scripts!

Recently I had to setup an Analysis Services cube and expose it to external users. This is usually done by using Internet Information Server (IIS) and creating a new WebSite which hosts msmdpump.dll. This DLL more or less wraps XMLA commands inside HTTP thus allowing external users to access the cube via HTTP. Besides Windows Authentication this setup also allows Basic Authentication and so external users can simply connect by specifying Username and Password in e.g. Excel when connecting to the cube:

There are already a lot of whitepapers out there which describe how to set things up correctly. Here are just some examples:
– MSDN: http://msdn.microsoft.com/en-us/library/gg492140.aspx

They provide very useful information and you should be familiar with the general setup before proceeding here or using the final PowerShell script.

The PowerShell script basically performs the following steps:

1. Create a local folder as base for your WebSite in IIS
2. Copy SSAS ISAPI files (incl. msmdpump.dll) to the folder
3. Create and Configure an IIS AppPool
4. Create and Configure a IIS WebSite
5. Add and enable an ISAPI entry for msmdpump.dll
6. Configure Authentication
7. Configure Default Document
8. Update connection information to SSAS server

I tested it successfully with a clean installation of IIS 8.0 (using applicationhost.config.clean.install). In case you already have other WebSites running you may still consider doing the steps manually or adopting the script if necessary. The script is written not to overwrite any existing Folders, WebSites, etc. but you never know.

So here is my final script:

1. #Import Modules
3.
4. # change these settings
5. \$iisSiteName = "OLAP"
6. \$iisPort = "8000"
7. \$olapServerName = "server\instance"
8.
9. # optionally also change these settings
10. \$isapiFiles = "c:\Program Files\Microsoft SQL Server\MSAS11.MSSQLSERVER\OLAP\bin\isapi\*"
11. \$iisAbsolutePath = "C:\inetpub\wwwroot\" + \$iisSiteName
12. \$iisAppPoolName = \$iisSiteName + "_AppPool"
13. \$iisAppPoolUser = "" #default is ApplicationPoolIdentity
15. \$iisAuthAnonymousEnabled = \$false
16. \$iisAuthWindowsEnabled = \$true
17. \$iisAuthBasicEnabled = \$true
18. \$olapSessionTimeout = "3600" #default
19. \$olapConnectionPoolSize = "100" #default
20.
21. if(!(Test-Path \$iisAbsolutePath -pathType container))
22. {
23.     #Creating Directory
24.     mkdir \$iisAbsolutePath  | Out-Null
25.
26.     #Copying Files
27.     Write-Host -NoNewline "Copying ISAPI files to IIS Folder … "
28.     Copy -Path \$isapiFiles -Destination \$iisAbsolutePath -Recurse
29.     Write-Host " Done!" -ForegroundColor Green
30. }
31. else
32. {
33.     Write-Host "Path \$iisAbsolutePath already exists! Please delete manually if you want to proceed!" -ForegroundColor Red
34.     Exit
35. }
36.
37. #Check if AppPool already exists
38. if(!(Test-Path \$("IIS:\\AppPools\" + \$iisAppPoolName) -pathType container))
39. {
40.     #Creating AppPool
41.     Write-Host -NoNewline "Creating ApplicationPool \$iisAppPoolName if it does not exist yet … "
42.     \$appPool = New-WebAppPool -Name \$iisAppPoolName
43.     \$appPool.managedRuntimeVersion = "v2.0"
44.     \$appPool.managedPipelineMode = "Classic"
45.
46.     \$appPool.processModel.identityType = 4 #0=LocalSystem, 1=LocalService, 2=NetworkService, 3=SpecificUser, 4=ApplicationPoolIdentity
48.
49.     if (\$iisAppPoolUser -ne "" -AND \$iisAppPoolPassword -ne "") {
50.         Write-Host
51.         Write-Host "Setting AppPool Identity to \$iisAppPoolUser"
52.         \$appPool.processmodel.identityType = 3
55.     }
56.     \$appPool | Set-Item
57.     Write-Host " Done!" -ForegroundColor Green
58. }
59. else
60. {
61.     Write-Host "AppPool \$iisAppPoolName already exists! Please delete manually if you want to proceed!" -ForegroundColor Red
62.     Exit
63. }
64.
65. #Check if WebSite already exists
66. \$iisSite = Get-Website \$iisSiteName
67. if (\$iisSite -eq \$null)
68. {
69.     #Creating WebSite
70.     Write-Host -NoNewline "Creating WebSite \$iisSiteName if it does not exist yet … "
71.     \$iisSite = New-WebSite -Name \$iisSiteName -PhysicalPath \$iisAbsolutePath -ApplicationPool \$iisAppPoolName -Port \$iisPort
72.     Write-Host " Done!" -ForegroundColor Green
73. }
74. else
75. {
76.     Write-Host "WebSite \$iisSiteName already exists! Please delete manually if you want to proceed!" -ForegroundColor Red
77.     Exit
78. }
79.
80. #Ensuring ISAPI CGI Restriction entry exists for msmdpump.dll
81. if ((Get-WebConfiguration "/system.webServer/security/isapiCgiRestriction/add[@path='\$iisAbsolutePath\msmdpump.dll']") -eq \$null)
82. {
83.     Write-Host -NoNewline "Adding ISAPI CGI Restriction for \$iisAbsolutePath\msmdpump.dll … "
84.     Add-WebConfiguration "/system.webServer/security/isapiCgiRestriction" -PSPath:IIS:\\  -Value @{path="\$iisAbsolutePath\msmdpump.dll"}
85.     Write-Host " Done!" -ForegroundColor Green
86. }
87. #Enabling ISAPI CGI Restriction for msmdpump.dll
88. Write-Host -NoNewline "Updating existing ISAPI CGI Restriction … "
89. Set-WebConfiguration "/system.webServer/security/isapiCgiRestriction/add[@path='\$iisAbsolutePath\msmdpump.dll']/@allowed" -PSPath:IIS:\\ -Value "True"
90. Set-WebConfiguration "/system.webServer/security/isapiCgiRestriction/add[@path='\$iisAbsolutePath\msmdpump.dll']/@description" -PSPath:IIS:\\ -Value "msmdpump.dll for SSAS"
91. Write-Host " Done!" -ForegroundColor Green
92.
93.
94. #Adding ISAPI Handler to WebSite
95. Write-Host -NoNewline "Adding ISAPI Handler … "
96. Add-WebConfiguration /system.webServer/handlers -PSPath \$iisSite.PSPath -Value @{name="msmdpump"; path="*.dll"; verb="*"; modules="IsapiModule"; scriptProcessor="\$iisAbsolutePath\msmdpump.dll"; resourceType="File"; preCondition="bitness64"}
97. Write-Host " Done!" -ForegroundColor Green
98.
99. #enable Windows and Basic Authentication
100. Write-Host -NoNewline "Setting Authentication Providers … "
101. #need to Unlock sections first
102. Set-WebConfiguration /system.webServer/security/authentication/anonymousAuthentication  MACHINE/WEBROOT/APPHOST -Metadata overrideMode -Value Allow
103. Set-WebConfiguration /system.webServer/security/authentication/windowsAuthentication  MACHINE/WEBROOT/APPHOST -Metadata overrideMode -Value Allow
104. Set-WebConfiguration /system.webServer/security/authentication/basicAuthentication  MACHINE/WEBROOT/APPHOST -Metadata overrideMode -Value Allow
105.
106. Set-WebConfiguration /system.webServer/security/authentication/anonymousAuthentication -PSPath \$iisSite.PSPath -Value @{enabled=\$iisAuthAnonymousEnabled}
107. Set-WebConfiguration /system.webServer/security/authentication/windowsAuthentication -PSPath \$iisSite.PSPath -Value @{enabled=\$iisAuthWindowsEnabled}
108. Set-WebConfiguration /system.webServer/security/authentication/basicAuthentication -PSPath \$iisSite.PSPath -Value @{enabled=\$iisAuthBasicEnabled}
109. Write-Host " Done!" -ForegroundColor Green
110.
112. Write-Host -NoNewline "Adding Default Document msmdpump.dll … "
113. Add-WebConfiguration /system.webServer/defaultDocument/files -PSPath \$iisSite.PSPath -atIndex 0 -Value @{value="msmdpump.dll"}
114. Write-Host " Done!" -ForegroundColor Green
115.
116. #Updating OLAP Server Settings
117. Write-Host -NoNewline "Updating OLAP Server Settings … "
118. [xml]\$msmdpump = Get-Content "\$iisAbsolutePath\msmdpump.ini"
119. \$msmdpump.ConfigurationSettings.ServerName = \$olapServerName
120. \$msmdpump.ConfigurationSettings.SessionTimeout = \$olapSessionTimeout
121. \$msmdpump.ConfigurationSettings.ConnectionPoolSize = \$olapConnectionPoolSize
122. \$msmdpump.Save("\$iisAbsolutePath\msmdpump.ini")
123. Write-Host " Done!" -ForegroundColor Green
124.
125. Write-Host "Everything done! "
126. Write-Host "The SSAS server can now be accessed via http://\$env:computername`:\$iisPort"

The process of setting up HTTP connectivity is the same for Analysis Services Multidimensional and Tabular so the script works for both scenarios, just change the server name accordingly.

Hiding Dimension Details in Analysis Services

An Analysis Services cube is usually accessed by a wide variety of people all of them having different roles in the business. For example Product Managers, Sales Representatives, Key Account Managers and so on. Sometimes it is necessary to hide detailed information of a given dimension or attribute from a certain user or role. In this post I will show you how this can be accomplished.

Lets take a simple example and assume we have the role “ProductManager” which must not be allowed to see any specific details of a single customer but must be able to see aggregation levels above the customer like Gender, Education, etc.

This can be accomplished quite easily by changing the security settings of our Product Manager role. We navigate to “Dimension Data” > Dimension “Customer” > Attribute “Customer”.
Next go to the “Advanced” tab and manually specify your Allowed Set as:
{[Customer].[Customer].[All Customers]}

One important thing here is to uncheck “Enable Visual Totals” as shown above, otherwise you will not see any data at all! An end-user which owns the role ProductManager would see the following result in Excel:

He sees aggregated values on Education and Gender Level but can not see details of any single customer.

A scenario that’s a bit more complex to achieve is to hide not only one attribute but all attributes of a given dimension. Of course, we could use the approach as described above and do it for each and every attribute in the dimension. This would technically work, but its timely to implement and also hard to maintain (e.g. if new attributes are added). Ideally we only want to define it once in a single place and it should work for all attributes, existing and also new ones. We can make use auf Auto-Exists within our security role here. Auto-Exists basically reflect the security defined on one attribute to all other attributes in the same dimension. So if you are restricted to [Country]=”Austria” you can only see Cities in Austria and for Continents you would only see Europe but not America.

Having this in mind we could create a dummy-customer (which has no associated facts) on which we put our security. Instead of creating a dummy-customer manually in our relational DB we can also make use of the special UnknownMember which can be activated for any dimension. Doing this SSAS automatically creates a dummy-member for us which can be used for our security purposes:

Note that we have set the UnkownMemberName to “No Details Available” to inform the user that there are no details available for him/her.
The role itself would then specify the UnknownMember in the Allowed Set as:
{[Customer].[Customer].UNKNOWNMEMBER}

Again, make sure that “Enable Visual Totals” is unchecked!

Now our Excel looks as follows:

(Note: “Show items with no data on rows” was enabled in PivotTable options here)

Using any attribute or hierarchy of our secured dimension in a filter would show this result:

One last option you may consider is to set the Visibility of the UnknownMember to “Hidden” thus also hiding the UnknownMember and only revealing the All-members which was our initial requirement.  However, I think using the UnknownMember’s name to provide some further information for the end-user makes quite sense e.g. naming it “No Details Available due to Security Restrictions” or “Restricted by Security”.
This is of course not possible if you create your dummy-customer manually that’s why I prefer using the UnkownMember.

Note1:
During my tests I came across a bug which when you change the Visibility of the UnkownMember after you have already defined security makes you end up seeing all details. Even a redeploy or ProcessFull do not solve the problem. What you need to do is to explicitly change the definition of the role (e.g. by adding a blank to the Allowed Set) and redeploy the role afterwards.

Note2:
None of the above works for SSAS tabular as it does not have the concept of (disabled) VisualTotals which is essential for this solution to work!

Analysis Services Security: Multiple Roles in Tabular vs. Multidimensional – Part 2

In one of my recent posts I highlighted how multiple security roles are handled in tabular opposed to multidimensional Analysis Services models. In this post I will go into more detail and focus on how the security is evaluated for both models.

I would like to continue using the same example as in the last post with the following roles:
“Bikes” – is restricted to [ProductCategory] = “Bikes”
“Brakes” – is restricted to [ProductSubCategory] = “Brakes”
“DE” – is restricted to [Country] = “DE”

I used the following MDX query to test both, the tabular and the multidimensional model:

1. SELECT
2. NON EMPTY {[Geography].[Country Region Name].[Country Region Name].MEMBERS} ON 0,
3. NON EMPTY {[Product].[Product Category Name].[Product Category Name].MEMBERS} ON 1
4. FROM [Model]
5. WHERE (
6. [Measures].[Reseller Total Sales]
7. )

The last thing I mentioned was how the combination of roles “Bikes” and “DE” could be expressed in SQL:

1. SELECT
2.     [ProductCategory],
3.     [Country],
4.     SUM([Reseller Sales])
5. FROM <table>
6. WHERE [ProductCategory] = ‘Bikes’
7.     OR [Country] = ‘Germany’
8. GROUP BY
9.     [Product Category],
10.     [Country]

When running the previous MDX query on our tabular model using roles “Bikes” and “DE” we will see a very similar query being executed against our xVelocity Storage Engine:
(Note: There are also some SE engine queries executed to get the elements for rows and columns but the query below is the main query)

1. SELECT
2. [Geography_1fa13899-0770-4069-b7cb-5ddf22473324].[EnglishCountryRegionName],
3. [Product_11920c93-05ae-4f1c-980e-466dfbcfca2a].[CalculatedColumn1 1],
4. SUM([Reseller Sales_fc635e72-28dc-4156-80d5-43b805f8df1c].[SalesAmount])
5. FROM [Reseller Sales_fc635e72-28dc-4156-80d5-43b805f8df1c]
6. LEFT OUTER JOIN [Reseller_d52b9c6f-8d2d-4e23-ae4c-2fc57c1d968a]
7.         ON [Reseller Sales_fc635e72-28dc-4156-80d5-43b805f8df1c].[ResellerKey]
8.             =[Reseller_d52b9c6f-8d2d-4e23-ae4c-2fc57c1d968a].[ResellerKey]
9. LEFT OUTER JOIN [Geography_1fa13899-0770-4069-b7cb-5ddf22473324]
10.     ON [Reseller_d52b9c6f-8d2d-4e23-ae4c-2fc57c1d968a].[GeographyKey]
11.         =[Geography_1fa13899-0770-4069-b7cb-5ddf22473324].[GeographyKey]
12. LEFT OUTER JOIN [Product_11920c93-05ae-4f1c-980e-466dfbcfca2a]
13.     ON [Reseller Sales_fc635e72-28dc-4156-80d5-43b805f8df1c].[ProductKey]
14.         =[Product_11920c93-05ae-4f1c-980e-466dfbcfca2a].[ProductKey]
15. WHERE
16. (COALESCE((PFDATAID( [Product_11920c93-05ae-4f1c-980e-466dfbcfca2a].[CalculatedColumn1 1] ) = 6))
17.     OR
18. COALESCE((PFDATAID( [Geography_1fa13899-0770-4069-b7cb-5ddf22473324].[CountryRegionCode] ) = 5)));

Well, not really readable so lets make it a bit nicer by removing those ugly GUIDs, etc:

1. SELECT
2.     [Geography].[EnglishCountryRegionName],
3.     [Product].[ProductCategory],
4.     SUM([Reseller Sales].[SalesAmount])
5. FROM [Reseller Sales]
6. LEFT OUTER JOIN [Reseller]
7.     ON [Reseller Sales].[ResellerKey] = [Reseller].[ResellerKey]
8. LEFT OUTER JOIN [Geography]
9.     ON [Reseller].[GeographyKey] = [Geography].[GeographyKey]
10. LEFT OUTER JOIN [Product]
11.     ON [Reseller Sales].[ProductKey] = [Product].[ProductKey]
12. WHERE [Product].[ProductCategory] = "Bikes"
13. OR [Geography].[CountryRegionCode] = "Germany";

This looks very similar to our SQL query. The special thing about it is the WHERE clause which combines the restrictions of both roles using OR which is then propagated also to our [Reseller Sales] fact table and that’s the reason why we see what we want and expect to see – all sales that were either made with “Bikes” OR made in “Germany”:

Another very important thing to note and remember here is that the security restrictions get propagated into and are evaluated within the query. This is done for each and every query (!) which is usually not a problem but may become crucial if you use dynamic security.
To test this with dynamic security I introduced a new role called “CustData” which is going to replace our “Bikes” for this test. It is restricted on table ‘Product’ as:

1. =([Product Category Name] = IF( CUSTOMDATA() = "1", "Bikes", "Clothing"))

So instead of using the connectionstring “…;Roles=Bikes,DE; …” I now used “…;Roles=CustData,DE;CustomData=1; …” which results in the exact same results of course. However, the query now changed to the following (already beautified) xVelocity query:

1. SELECT
2.     [Geography].[EnglishCountryRegionName],
3.     [Product].[ProductCategory],
4.     SUM([Reseller Sales].[SalesAmount])
5. FROM [Reseller Sales]
6. LEFT OUTER JOIN [Reseller]
7.     ON [Reseller Sales].[ResellerKey] = [Reseller].[ResellerKey]
8. LEFT OUTER JOIN [Geography]
9.     ON [Reseller].[GeographyKey] = [Geography].[GeographyKey]
10. LEFT OUTER JOIN [Product]
11.     ON [Reseller Sales].[ProductKey] = [Product].[ProductKey]
12. WHERE [Product].\$ROWFILTER IN '0x000000…000000000000000000000fffff00000000000ffffffff'));
13. OR [Geography].[CountryRegionCode] = "Germany";

Instead of using a direct filter on [ProductCategory] we now see a filter on \$ROWFILTER ?!?

### ASSUMPTIONS START ###
I have to admit that I am currently not entirely sure what this means but I assume the following:
Preceding the main query another xVelocity query is executed which is important for us:

1. SELECT
2. [Product].[RowNumber],
3. [Product].[ProductCategory],
4. COUNT()
5. FROM [Product];

This query fetches each [RowNumber] and its associated [ProductCategory]. Internally the [RowNumber] column is created for every table. This is related to the columnar storage that xVelocity uses. Elaborating this in detail would go far beyond the context of this blog post. For more details on the RowNumber-column please refer too http://msdn.microsoft.com/en-us/library/hh622558(v=office.15).aspx which describes the Excel data model which is actually Power Pivot and therefore also applies to Tabular. (In general this link contains a lot of in-depth information on the tabular data model and the columnar storage concepts!)

I further assume that our security-expression is then evaluated against this temporary table to create an bitmap index of which rows match the security-expression and which don’t. This result is then applied to our main query which using the WHERE clause  [Product].\$ROWFILTER IN ‘0x0000….’
For all subsequent queries the above query on [RowNumber] and [ProductCategory] is not executed again so I assume that the bitmap index gets cached internally by Analysis Services. I further assume that if the bitmap index gets cached it is also shared between users belonging to the same role which would be similar to multidimensional models.
### ASSUMPTIONS END ###

So the important take-away for tabular is that the security gets propagated into the query and down to the fact table. Combining multiple roles on this level using OR delivers the expected results.

For multidimensional models this is slightly different. You can define security on either the Database Dimension (which gets inherited down to all Cube Dimension) or you can define security on the Cube Dimension directly. Defining security on the Database Dimension already makes it very clear that the security restrictions are independent of any fact table/measure group. A Database Dimension may be used in several cubes so the engine cannot know in advance which measure group to use. Security for multidimensional models is defined on the multidimensional space defined by that dimension. If one role is not restricted on a dimension at all, the user will always see the whole dimension and its associated values, even if a second role would restrict that dimension. And this causes unexpected results as the user may see the whole cube.
In terms of SQL the query could be expressed as:

1. SELECT
2.     [ProductCategory],
3.     [Country],
4.     SUM([Reseller Sales])
5. FROM <table>
6. WHERE ( [ProductCategory] = 'Bikes' OR 1 = 1)
7.     OR ( 1 = 1  OR [Country] = 'Germany')
8. GROUP BY
9.     [Product Category],
10.     [Country]

The left side of the inner OR statements represents the role “Bikes” whereas the right part represents the “DE” role. It should be very obvious that due to the combination of both roles you finally see everything without any restriction!

Another important thing to point out is that security for multidimensional models is only evaluated once and not for every query. So even if you have complex dynamic security its evaluation only hits you once for the first user that connects to a role and is cached for all queries of that user and also shared with other users belonging to that role.

I hope this gives some more insights on how tabular and multidimensional handle multiple roles and their fundamental differences!

Analysis Services Security: Multiple Roles in Tabular vs. Multidimensional

In terms of security tabular and multidimensional models of SQL Server Analysis Services are very similar. Both use Roles to handle specific security settings. Those Roles are then assigned to users and groups.  This is very trivial if a user only belongs to one Role – the user is allowed to see what is specified in the Role. But it can become very tricky if a user belongs to more than one role.

For both, tabular and multidimensional security settings of multiple roles are additive. This means that you will not see less if you are assigned multiple roles but only more. Lets assume we have the following Roles:
“Bikes” – is restricted to [ProductCategory] = “Bikes”
“Brakes” – is restricted to [ProductSubCategory] = “Brakes”
“DE” – is restricted to [Country] = “DE”

A user belonging to Roles “Bikes” and “Brakes” will see all products that belong to “Bikes” and all products that belong to “Brakes”. This is OK and returns the expected results as  both roles secure the same dimension/table. This applies to tabular and also multidimensional.

However, if roles secure different dimensions/tables it gets a bit more tricky. A user may belong to Roles “Bikes” and “DE”. For multidimensional this is a real problem as it finally result in the user seeing the whole cube! This is “by design” and can be explained as follows:
Role “Bikes” does not have any restriction on [Country] so all countries are visible, Role “DE” has no restriction on [ProductCategory] so all product categories are visible. As Roles are additive the user is allowed to see all countries and also all product categories, hence the whole cube:

Basically you would expect to see “Bikes”-sales for all countries and “Germany”-sales for all product categories but you end up seeing much more than this. If you have every faced this problem in real life you know that this is probably not the intended behavior your customers want to see. Usually Active Directory Groups are used and assigned to SSAS roles, so this can happen quite easily without anyone even noticing (except the user who is happy to see more )!
Chris Webb wrote an excellent blog post on how to deal with those kinds of security requirements in multidimensional models here.

For tabular this behavior is somehow similar. A user belonging to both roles is also allowed to see all countries and all product categories – this is because security settings are additive, same as for multidimensional. Even though this is true in terms of the structure (rows, columns) of the query we still get a different result in terms of values!
Here is the same query on a tabular model with the same security settings:

This is exactly what we and our customers would expect to see – all sales for “Germany” and also every sale related to “Bikes”! In tabular models security applied to a given table cascades down to all related tables – in this case to our fact table. If a table is indirectly secured by different Roles which put security on different tables those restrictions are combined using OR. In terms of SQL this could be expressed as:

1. SELECT
2.     [ProductCategory],
3.     [Country],
4.     SUM([Reseller Sales])
5. FROM <table>
6. WHERE [ProductCategory] = 'Bikes'
7.     OR [Country] = 'Germany'
8. GROUP BY
9.     [Product Category],
10.     [Country]

Further pivoting the result would show the same as the MDX query.

Thinking back to some of my multidimensional cubes where I had to deal with multiple Roles this “slight difference” would have been really handy and would have saved me a lot of time that I put into custom security solutions using bridge tables, assemblies, etc.

In my next post I will go into much more detail on how the tabular security model works so stay tuned!

UPDATE: Part 2 can be found here

Upcoming Conferences and Events in Q1 2013

After PASS SQL Rally Nordic in Stockholm last year I am very happy to announce that I am going to speak at two more events in the first quarter of 2013.

I will do a session at the “Deutsche SQL Server Konferenz 2014” (=”German SQL Server Conference 2014″) on “Big Data Scenario mit Power BI vs. SAP HANA“. It is basically an advanced version of my blog post on SAP HANAs Big Data Scenario with Power BI with much more insights details on both technologies. The conference itself is a 3 day conference from 10th to 12th of February where day 1 is reserved for pre-conference sessions. It also features a lot of international speakers and of course also a good amount of English sessions (mine will be in German though). Also my colleague Marcel Franke will do a session about PDW and R which you also do not want to miss if you are into Big Data and predictive analytics!
Check out the agenda and make sure you register in time!

Later on the 6th of March I will speak at the SQLSaturday #280 in Vienna on “Scaling Analysis Services in the Cloud“. (This is not a typo, its really on 6th of March which is actually a Thursday!) The session focuses on how to get the best performance out of multidimensional Analysis Services solutions when they are moved to the Windows Azure cloud. In the end I will come up with some best practices and guide lines for this and similar scenarios.
Just make sure that you register beforehand to enjoy this full day of free sessions and trainings!

Hope to see you there!

Using Power Query to analyze SSAS Disk Usage

Some time ago Bob Duffy blogged about on how to use Power Pivot to analyze the disk usage of multidimensional Analysis Services models (here). He uses a an VBA macro to pull meta data like filename, path, extension, etc. from the file system or to be more specific from the data directory of Analysis Services. Analysis Services stores all its data in different files with specific extensions so it is possible to link those files to multidimensional objects in terms of attributes, facts, aggregations, etc. Based on this data we can analyze  how our data is distributed. Do we have too big dimensions? Which attribute uses the most space? Do our facts consume most of the space (very likely)? If yes, how much of it is real data and how big are my aggregations – if they are processed at all?!? – These are very common and also important things to know for an Analysis Services developer.

So Bob Duffy’s solution can be really useful. The only thing I did not like about it was the fact that it uses a VBA macro to get the data. This made me think and I came up with the idea of using Power Query to get this data. Btw, make sure to check out the latest release, there have been a lot of improvements recently!

With Power Query you have to option to load multiple files from a folder and also from its sub folders. When we do this on our Analysis Services data directory, we get a list of ALL files together with their full path, filename, extension and most important in this case their size which can be found by expanding the Attributes-record:

The final Power Query does also a lot of other things to prepare the data so it can be later joined to our FileExtensions-table that holds detailed information for each file extension. This table currently looks like below but can be extended by any other columns that may be necessary and/or useful for you:

 FileType FileType_Description ObjectType ObjectTypeSort ObjectTypeDetails ahstore Attribute Hash Store Dimensions 20 Attribute asstore Attribute String Store Dimensions 20 Attribute astore Attribute Store Dimensions 20 Attribute bsstore BLOB String Store Dimensions 20 BLOB bstore BLOB Store Dimensions 20 BLOB dstore Hierarchy Decoding Store Dimensions 20 Hierarchy khstore Key Hash Store Dimensions 20 Key ksstore Key String Store Dimensions 20 Key kstore Key Store Dimensions 20 Key lstore Structure Store Dimensions 20 Others ostore Order Store Dimensions 20 Others sstore Set Store Dimensions 20 Others ustore ustore Dimensions 20 Others xml XML Configuration 999 Configuration fact.data Basedata Facts 10 Basedata fact.data.hdr Basedata Header Facts 10 Basedata fact.map Basedata Index Facts 10 Basedata fact.map.hdr Basedata Index Header Facts 10 Basedata rigid.data Rigid Aggregation Data Facts 10 Aggregations rigid.data.hdr Rigid Aggregation Data Header Facts 10 Aggregations rigid.map Rigid Aggregation Index Facts 10 Aggregations rigid.map.hdr Rigid Aggregation Index Header Facts 10 Aggregations flex.data Flexible Aggregation Data Facts 10 Aggregations flex.data.hdr Flexible Aggregation Data Header Facts 10 Aggregations flex.map Flexible Aggregation Index Facts 10 Aggregations flex.map.hdr Flexible Aggregation Index Header Facts 10 Aggregations string.data String Data (Distinct Count?) Facts 10 Basedata cnt.bin Binary Configuration 999 Binaries mrg.ccmap mrg.ccmap DataMining 999 DataMining mrg.ccstat mrg.ccstat DataMining 999 DataMining nb.ccmap nb.ccmap DataMining 999 DataMining nb.ccstat nb.ccstat DataMining 999 DataMining dt dt DataMining 999 DataMining dtavl dtavl DataMining 999 DataMining dtstr dtstr DataMining 999 DataMining dmdimhstore dmdimhstore DataMining 999 DataMining dmdimstore dmdimstore DataMining 999 DataMining bin Binary Configuration 999 Binaries OTHERS Others Others 99999 Others

As you can see the extension may contain 1, 2 or 3 parts. The more parts the more specific this file extension is. If you checked the result of the Power Query it also contains 3 columns, FileExtension1, FileExtension2 and FileExtension3. To join the two tables we first need to load both tables into Power Pivot. The next step is to create a proper column on which we can base our relationship. If the 3-part extension is found in the file extensions table, we use it, otherwise we check the 2-part extension and afterwards the 1-part extension and in case nothing matches we use “OTHERS”:

=SWITCH(TRUE(),
CONTAINS(FileTypes, FileTypes[FileType], [FileExtension3]), [FileExtension3],
CONTAINS(FileTypes, FileTypes[FileType], [FileExtension2]), [FileExtension2],
CONTAINS(FileTypes, FileTypes[FileType], [FileExtension1]), [FileExtension1],
"OTHERS")

Then we can create a relationship between or PQ table and our file extension table. I also created some other calculated columns, hierarchies and measures for usability. And this is the final outcome:

You can very easily see, how big your facts are, the distribution between base-data and Aggregations, the Dimensions sizes and you can drill down to each individual file! You can of course also create a Power View report if you want to. All visualizations are up to you, this is just a very simple example of a report.

Enjoy playing around with it!

(please note that I added a filter on the Database name as a last step of the Power Query to only show Adventure Works databases! In order to get all databases you need to remove this filter!)

SSAS Disk Analysis Workbook: SSAS_DiskAnalysis.xlsx

Trigger Cube-Processing from Excel using Cube Actions

Recently I faced the requirement to give specific users of my Analysis Services Multidimensional cube the opportunity to process the cube whenever they want. There are several ways to achieve this:
– start a SQL Agent that processes the cube
– create some kind of WebService that does the processing
– create some custom .NET application that does the processing
– run some script (PowerShell, Command Line, ascmd.exe, …)

NOTE:
The post describes a solution which requires SSAS Server Admin rights. If you are interested in a solution which triggers an asynchronous processing without requiring SSAS Server Admin Rights please scroll down to the very bottom and download VS_Solution.zip
It basically runs the same script but in a separate thread in the context of the service account so the calling user does not need to have any specific rights on the server.

From a usability point-of-view none of the above is really convenient as all options involve some kind of external process or application. In this post I will show how to process a cube using cube actions which can be triggered from Excel Pivot Tables natively. So this is what we want to achieve:

This requires several steps:
1) create custom .Net assembly

Creating a Custom .Net assembly:

First of all we need to create a new “Visual C#” project of type “Class Library”. To work with Analysis Services we need to add two references:

“Microsoft.AnalysisServices” refers to Microsoft.AnalysisServices.dll which  can be found in the shared features folder of your installation (default is c:Program FilesMicrosoft SQL Server110SDKAssemblies)

“msmgdsrv” refers to msmgdsrv.dll which can found be in the OLAPbin-folder of your SSAS instance (default is c:Program FilesMicrosoft SQL ServerMSAS11.MSSQLSERVEROLAPbin)

Once these references are added, we can start to create our processing code:

using System;
using Microsoft.AnalysisServices;
using System.Data;

namespace ASSP_Processing
{
public class Processing
{
public const string LoggingPrefix = “ASSP.ProcessObject: “;

public enum ResultHandling
{
Datatable = 0,
Exception = 1,
NULL = 2
}

public static DataTable ProcessObject(string cubeName, ResultHandling resultHandling)
{
DataTable ret = new DataTable();

Server server = null;

try
{
server = new Server();

//connect to the current session…
server.Connect(“*”);

AdomdServer.Context.CheckCancelled(); //could be a bit long running, so allow user to cancel

string objectToProcessName = “#N/A”;
ProcessableMajorObject objectToProcess = db;

AdomdServer.Context.TraceEvent(0, 0, LoggingPrefix + “Database <“ + db.Name + “> found!”);
objectToProcessName = “DB[“ + db.Name + “]”;

if (!string.IsNullOrEmpty(cubeName))
{
Cube cube = db.Cubes.GetByName(cubeName);

if (cube != null)
{
objectToProcess = cube;
AdomdServer.Context.TraceEvent(0, 0, LoggingPrefix + “Cube <“ + cubeName + “> found!”);
objectToProcessName = objectToProcessName + ” > Cube[“ + cubeName + “]”;
}
}

if (objectToProcess != null)
{
AdomdServer.Context.TraceEvent(0, 0, LoggingPrefix + “Processing Object “ + objectToProcessName + ” …”);

objectToProcess.Process(ProcessType.ProcessFull);

ret.Rows.Add(new object[] { “S U C C E S S:    Object “ + objectToProcessName + ” successfully processed!” });
AdomdServer.Context.TraceEvent(0, 0, LoggingPrefix + “Finished Processing Object “ + objectToProcessName + “!”);
}
}
finally
{
try
{
if (server != null)
server.Disconnect();
}
catch { }
}

// if processing was successful a row has been added beforehand
if (ret.Rows.Count == 0)
{
ret.Rows.Add(new object[] { “F A I L U R E:    Error while processing an object!” });
}

switch (resultHandling)
{
case ResultHandling.Datatable:
return ret;

case ResultHandling.Exception:
throw new Exception(Environment.NewLine + Environment.NewLine + Environment.NewLine + ret.Rows[0][0].ToString());

case ResultHandling.NULL:
return null;
}

return null;
}

public static DataTable ProcessObject(string cubeName, int resultHandling)
{
return ProcessObject(cubeName, (ResultHandling)resultHandling);
}

public static DataTable ProcessObject(string cubeName)
{
return ProcessObject(cubeName, 1);
}
}
}

There are two things here that are worth to point out. First of all we need to somehow establish a connection to the server that hosts the SSAS database in order to process an object. The first thing that comes into mind would be to create a new connection to the server and run some processing XMLA. Unfortunately this does not work here because as this would result in a deadlock. The assembly will be called as a Rowset-Action which in terms of locking is similar to a query. So if we run separate processing command within our Rowset-Action this processing command can never be committed as there is always a query running on the database. The query (=Rowset-Action) will wait until the function is finished and the function waits until the processing is committed resulting in a classical deadlock!

To avoid this we need to connect to the current users sessions using “*” as our connection string:

//connect to the current session…
server.Connect(“*”);

The second thing to point out is the return value of our function. In order to be used in a Rowset-action, the function has to return a DataTable object. The results would then be displayed similar to a drill through and a separate worksheet would pop up in Excel showing the result:

From my point-of-view this can be a bit annoying as after each processing this new worksheet pops up and you loose the focus of the pivot table. So I investigated a bit and found another way to display the output of the processing.

When a drill through / action throws an exception, this will also be displayed in Excel without actually displaying the result of the drill through / action in a separate worksheet. By default it may look like this message:

Well, not really much information here right?

To display more information, e.g. that the processing was successful (or not) we can throw our own exception in the code after processing is done:

I added a separate parameter to control this behavior, in the code above the default would be option 3 – custom Exception.

Once this project is built the DLL can be added to the SSAS database. It is necessary to grant the assembly “Unrestricted” permission set:

The last step would be to create the appropriate cube actions that call the assembly:

It is necessary that the action is of Type=Rowset, otherwise it would not be displayed in Excel! The Action Target is up the you, in this example it will be displayed on every cell. The Action expression is the actual call to our assembly passing the cube name as a parameter. The second parameter controls how the result is displayed and could also be omitted.  In this example I added 3 separate actions, one for each result-type (as described above).

And that’s it! This simple steps allow you and your users to process any cube (or any other database object) from within your favorite client application, assuming it supports cube actions.

This is the final result where the three cube actions can be triggered from an Excel pivot table:

The attached zip-file includes everything that is necessary to recreate this example:
– SSAS project
– ASSP_Processing project

CubeProcessingAction.zip

Solution for Asynchronous Processing without Admin-Rights:
VS_Solution.zip