Dynamic ABC Analysis in PowerPivot using DAX

An ABC Analysis is a very common requirement for for business users. It classifies e.g. Items, Products or Customers into groups based on their sales and how much impact they had on the cumulated overall sales. This is done in several steps.

I just published a new version of the Dynamic ABC Analysis at www.daxpatterns.com. The article can be found here.

 

1) Order products by their sales in descending order
2) Cumulate the sales beginning with the best selling product till the current product
3) Calculate the percentage of the cumulated sales vs. total sales
4) Assign a Class according to the cumulated percentage

Marco Russo already blogged about this here. He does the classification in a calculated column based on the overall sales of each product. As calculated columns are processed when the data is loaded, this is not dynamic in terms of your filters that you may apply in the final report. If, for example, a customer was within Class A regarding total sales but had no sales last year then a report for last year that uses this classification may give you misleading results.

In this blog I will show how to do this kind of calculation on-the-fly always in the context of the current filters. I am using Adventure Works DW 2008 R2 (download) as my sample data and create a dynamic ABC analysis of the products.

The first thing we notice is that our product table is a slowly changing dimension of type 2 and there are several entries for the same product as every change is traced in the same table.

SCD2Table

So we want to do our classification on the ProductAlternateKey (=Business Key) column instead of our ProductKey (=Surrogate Key) column.

First we have to create a ranking of our products:

Rank CurrentProducts:=IF(HASONEVALUE(DimProduct[ProductAlternateKey]),
IF(NOT(ISBLANK([SUM SA])),
RANKX(
CALCULATETABLE(
VALUES(DimProduct[ProductAlternateKey]),
ALL(DimProduct[ProductAlternateKey])),
[SUM SA])))

Check if there is only one product in the current context and that this product also has sales. If this is the case we calculate our rank. We need to do the CALCULATETABLE to do the ranking within the currently applied filters on other columns of the DimProduct-table e.g. if a filter is applied to DimProduct[ProductSubcategoryKey] we want to see our ranking within that selected Subcategory and not against all Products.

I also created a measure [SUM SA] just to simplify the following expressions:

SUM SA:=SUM(FactInternetSales[SalesAmount])

 

The second step is to create a running total starting with the best-selling product/the product with the lowest rank:

CumSA CurrentProducts:=SUMX(
TOPN(
[Rank CurrentProducts],
CALCULATETABLE(
VALUES(DimProduct[ProductAlternateKey]),
ALL(DimProduct[ProductAlternateKey])),
[SUM SA]),
[SUM SA])

We use a combination of SUMX() and TOPN() here. TOPN() returns the top products ordered by [SUM SA]. Further we use our previously calculated rank to only get the products that have the same or more sales than the current product. For example if the current product has rank 3 we sum up the top 3 products to get our cumulated sum (=sum of the first 3 products) for this product. Again we need to use CALCULATETABLE() to retain other filters on the DimProduct-table.

 

The third step is pretty easy – we need to calculate percentage of the cumulated sales vs. the total sales:

CumSA% CurrentProducts:=
[CumSA CurrentProducts]
/
CALCULATE([SUM SA], ALL(DimProduct[ProductAlternateKey]))

This calculation is straight forward and should not need any further explanation.

The result of those calculations can be seen here:

Theory

 

To do our final classification we have to extend our model with a new table that holds our classes and their border-values:

Class LowerBoundary UpperBoundary
A 0 0.7
B 0.7 0.9
C 0.9 1

Class A should contain products which’s cumulated sales are between 0 and 0.7 – between 0% and 70%.
Class B should contain products which’s cumulated sales are between 0.7 and 0.9 – between 70% and 90%.
etc.

(This table can later be extended to support any number of classes and any boundaries between 0% and 100%.)

To get the boundaries of the selected class we create two measures that are later used in our final calculation:

MinLowerBoundary:=MIN([LowerBoundary])
MaxUpperBoundary:=MAX([UpperBoundary])

 

Our final calculation looks like this:

SA Classified Current:=IF(NOT(ISCROSSFILTERED(Classification[Class])),
[SUM SA],
CALCULATE(
[SUM SA],
FILTER(
VALUES(DimProduct[ProductAlternateKey]),
[MinLowerBoundary] < [CumSA% CurrentProducts]
&& [CumSA% CurrentProducts] <= [MaxUpperBoundary])))

If our Classification-table is not filtered, we just show our [SUM SA]-measure. Otherwise we extend the filter on our DimProduct[ProductAlternateKey] using our classification filtering out all products that do not fall within the borders of the currently selected class.

This measure allows us to see the changes of the classification of a specific product e.g. over time:

FinalResults_1

In 2006 our selected product was in Class C. For 2007 and 2008 it improved and is now in Class A. Still, overall it resides in Class B.

We may also analyze the impact of our promotions on the sales of our classified products:

FinalResults_2

Our Promotion “Touring-1000 Promotion” only had impact on products in Class C so we may consider to stop that promotion and invest more into the other promotions that affect all classes.

 

The classification can be used everywhere you need it – in the filter, on rows or on columns, even slicers work. The only drawback is that the on-the-fly calculation can take quite some time. If I find some time in the future i may try to further tune them and update this blog-post.

 

The example workbook can be downloaded here:

Though it is already in Office 2013 format an may not be opened with any previous versions of Excel/PowerPivot.
It also includes a second set of calculations that use the same logic as described above but does all the calculations without retaining any filters on the DimProducts-table. This allows you to filter on Class “A” and ProductSubcategory “Bike Racks” realizing that “Bike Racks” are not a Class “A” product or to see which Subcategories or Categories actually contain Class A, B or C products!

DAXMD and DefaultMembers

With Power View for Multidimensional Models – Preview Microsoft recently made the first SQL Server version available that allows you to query your multidimensional models using DAX, or the be exactly, using DAXMD. As tabular and multidimensional model are fundamentally different in terms of their underlying data structures , there are also some differences in how to query them. Jason Thomas already blogged about some of those differences here and showed how to query attributes in DAXMD. In this blog I will focus on how DefaultMembers are handled in DAXMD.

As you probably know, there is no concept of DefaultMembers in tabular models but they are essential for multidimensional models. In most cases the DefaultMember is the All-Member of the hierarchy. This is also no issue for tabular models but it gets more tricky if there is a DefaultMember defined in the multidimensional model and you query it using a query language that is designed for tabular models like DAXMD.

For my tests I used the AdventureWorks Multidimensional Model (enterprise) from SQL Server 2008R2 which can be downloaded here.

This model contains a Dimension called [Scenario] which is Non-Aggregateable and also has a DefaultMember defined. To get a list of all available scenarios you could simply write the following MDX:

SELECT
{[Measures].[Amount]} ON 0,
[Scenario].[Scenario].[Scenario].allmembers ON 1
FROM [Adventure Works]

The result looks like this:
image

To get the similar result in DAX you would usually write a query like this:

EVALUATE
    SUMMARIZE(
        'Scenario',
        'Scenario'[Scenario],
        "Amount", 'Financial Reporting'[Amount])

If you run this query you get an error that says, that

Column [Scenario] is part of composite key, but not all columns of the composite key are included in the expression or its dependent expression.

This is because the our [Scenario]-attribute has a NameColumn that is different from it’s KeyColumn. DAXMD handles this as it would be a composite key as the NameColumn may not be unique without the KeyColumn. So you always also have to include the Key-Column in your query. This is usually done by using the following syntax: ‘MyTable’[MyColumn.Key0] where Key0 refers to the first column of the composite key. So lets change our query and see what happens:

EVALUATE
    SUMMARIZE(
        'Scenario',
        'Scenario'[Scenario.Key0],
        "Amount", 'Financial Reporting'[Amount])

Now we get a different saying that the column [Scenario.Key0] could not be found in the source table. That’s very strange as this syntax works just well for all other attributes that have different Key- and NameColumns.

(I posted this issue on Connect as for me this is an inconsistent behavior – feel free to vote for it here)

So, to further investigate into this problem we first have to check what columns actually exist in our ‘Scenario’ table. This can be done using this query:

EVALUATE 'Scenario'

image

We notice several things here:
1) the column is called [Scenario.UniqueName] and opposed to [Scenario.Key0]
2) the resultset only contains 1 row
3) this one row is the DAXMD representation of our DefaultMember

Ok, now that we know how to reference the column we can adopt our query accordingly:

EVALUATE
    SUMMARIZE(
        'Scenario',
        'Scenario'[Scenario.UniqueName],
        "Amount", 'Financial Reporting'[Amount])

image

The query works now but still only shows only one row – our DefaultMember. This is the special behavior for DAXMD and how DefaultMembers are handled. It internally applies a filter on the columns where DefaultMembers are set. In this case, whenever you query your ‘Scenario’ table, it is internally handled like

FILTER(
    'Scenario',
    'Scenario'[Scenario.UniqueName] = "[Scenario].[Scenario].&[1]")

 

This internal filter can be removed like any other filter on a table or a column using the same functions that we would usually use – ALL(), ALLEXCEPT(), etc.

So to get our DAXMD equivalent to our MDX query we have to remove the filter from our ‘Scenario’ table:

EVALUATE
    SUMMARIZE(
        ALL('Scenario'),
        'Scenario'[Scenario.UniqueName],
        'Scenario'[Scenario],
        "Amount", 'Financial Reporting'[Amount])

image

Please also note that the last two rows [Budget Variance] and [Budget Variance %] are calculated members that are created in the cubes MDX script and they show up just like regular rows! So calculated members are seamlessly integrated into DAXMD – awesome! Especially  if you consider that DAXMD is NOT translated into MDX but is natively integrated into the engine!

In this blog I showed how DefaultMembers defined in multidimensional models are handled in a tabular query language like DAXMD and what pitfalls I encounter. I hope this blogs helps you to better understand these internals and not to make the same mistakes again that I already did.