Basket Analysis is a very common analysis especially for online shops. Most online shops make use of it to make you buy products that “Others also bought …”. Whenever you view a specific product “Product A” from the online shop, basket analysis allows the shop to show you further products that other customers bought together with “Product A”. Its basically like taking a look into other customers shopping baskets. In this blog post I will show how this can be done using Power Pivot. Alberto Ferrari already blogged about it here some time ago and showed a solution for Power Pivot v1. There is also dedicated chapter in the whitepaper The Many-to-Many Revolution 2.0 which deals with Basket Analysis, already in Power Pivot v2. Power Pivot v2 already made the formula much more readable and also much faster in terms of performance. Though, there are still some things that I would like to add.
Lets take a look at the initial data model first:
First of all we do not want to modify this model but just extend it so that all previously created measures, calculations and, most important, the reports still work. So the only thing we do is to add our Product-tables again but with a different name. Note that I also added the Subcategory and Category tables in order to allow Basket Analysis also by the Product-Category hierarchy. As we further do not want to break anything we only use an inactive relationship to our ‘Internet Sales’ fact table.
After adding the tables the model looks as follows:
The next and actually last thing to do is to add the following calculated measure:
CALCULATE (
COUNTROWS ( ‘Internet Sales’ ),
CALCULATETABLE (
SUMMARIZE (
‘Internet Sales’,
‘Internet Sales’[Sales Order Number]
),
ALL ( ‘Product’ ),
USERELATIONSHIP ( ‘Internet Sales’[ProductKey], ‘Filtered Product’[ProductKey] )
)
)
(formatted using DAX Formatter)
The inner CALCULATETABLE returns a list/table of all [Sales Order Numbers] where a ‘Filtered Product’ was sold and uses this table to extend the filter on the ‘Internet Sales’ table. It is also important to use ALL(‘Product’) here otherwise we would have two filters on the same column ([ProductKey]) which would always result in an empty table. Doing a COUNTROWS finally returns all for all baskets where the filtered product was sold.
We could also change ‘Internet Sales'[Sales Order Number] to ‘Internet Sales'[CustomerKey] in order to analyze what other customers bought also in different baskets (This was done for Example 3). The whole SUMMARIZE-function could also be replaced by VALUES(‘Internet Sales'[Sales Order Number]). I used SUMMARIZE here as I had better experiences with it in terms of performance in the past, though, this may depend on your data. The calculation itself also works with all kind of columns and hierarchies, regardless whether its from table ‘Product’, ‘Filtered Product’, or any other table!
So what can we finally do with this neat formula?
1) Classic Basket Analysis – “Others also bought …”:
As we can see Hydration Packs are more often sold together with Mountain Bikes opposed to Road Bikes and Touring Bikes. We could also use a slicer on ‘Filtered Product Subcategory’=”Accessories” in order to see how often Accessories are sold together with other products. You may analyze by Color and Product Category:
As we can see people that buy black bikes are more likely to buy red helmets than blue helmets.
2) Basket Analysis Matrix:
What may also be important for us is which products are sold together the most often? This can be achieved by pulling ‘Product’ on rows and ‘Filtered Product’ on columns. By further applying conditional formatting we can identify correlations pretty easy:
Water Bottles are very often sold together with Bottle Cages – well, not really a surprise. Again, you can also use all kind of hierarchies here for your analysis.
This is what the whole matrix looks like:
The big blank section in the middle are our Bikes. This tells us that there is no customer that bought two bikes in the same order/basket.
For this analysis I used an extended version of the calculation above to filter out values where ‘Product’ = ‘Filtered Product’ as of course every product is sold within its own basket:
IF (
MIN ( ‘Product’[ProductKey] )
<> MIN ( ‘Filtered Product’[ProductKey] ),
[Sold in same Order]
)
3) Find Customers that have not bough a common product yet
As we now know from the above analysis which products are very often bought together we can also analyze which customers do not fall in this pattern – e.g. customers who have bough a Water Bottle but have not bought a Bottle Cage yet. Again we can extend our base-calculation to achieve this:
IF (
NOT ( ISBLANK ( [Sum SA] ) ) && NOT ( [Sold to same Customer] ),
“Not Sold Yet”
)
The first part checks if the selected ‘Product’ was sold to the customer at all and the second part checks if the ‘Filtered Product’ was not sold to the customer yet. In that case we return “Not Sold Yet”, and otherwise BLANK() which is the default if the third parameter is omitted. That’s the result:
Aaron Phillips has bought a Water Bottle but no Mountain Bottle Cage nor a Road Bottle Cage – maybe we should send him some advertisement material on Bottle Cages?
As you can see there are a lot of analyses possible on top of that little measure that we created originally. All work with any kind of grouping or hierarchy that you may have and no change to your data model is necessary, just a little extension.
And that’s it – Basket Analysis made easy using Power Pivot and DAX!
Downloads:
Sample Excel Workbook with all Examples: BasketAnalysis.xlsx
Sample PowerBI Desktop file with all Examples (DAX only, no Visuals): BasketAnalysis.pbix