Calculating the Events-In-Progress is a very common requirement and many of my fellow bloggers like Chris Webb, Alberto Ferrari and Jason Thomas already blogged about it and came up with some really nice solutions. Alberto also wrote a white-paper summing up all their findings which is a must-read for every DAX and Tabular/PowerPivot developer.

However, I recently had a slightly different requirement where I needed to calculate the Events-In-Progress for Time Periods – e.g. the Open Orders in a given month – and not only for a single day. The calculations shown in the white-paper only work for a single day so I had to come up with my own calculation to deal with this particular problem.

Before we can start we need to identify which orders we actually want to count if a Time Period is selected. Basically we have to differentiate between 6 types of Orders for our calculation and which of them we want to filter or not:

Order | Definition |

Order1 (O1) | Starts before the Time Period and ends after it |

Order2 (O2) | Starts before the Time Period and ends in it |

Order3 (O3) | Starts in the Time Period and ends after it |

Order4 (O4) | Starts and ends in the Time Period |

Order5 (O5) | Starts and ends after the Time Period |

Order6 (O6) | Starts and ends before the Time Period |

For my customer an order was considered as “open” if it was open within the selected Time Period, so in our case we need to count only Orders O1, O2, O3 and O4. The first calculation you would usually come up with may look like this:

- [MyOpenOrders_FILTER] :=
- CALCULATE (
- DISTINCTCOUNT ( 'Internet Sales'[Sales Order Number] ),
- FILTER (
- 'Internet Sales',
- 'Internet Sales'[Order Date]
- <= CALCULATE ( MAX ( 'Date'[Date] ) )
- ),
- FILTER (
- 'Internet Sales',
- 'Internet Sales'[Ship Date]
- >= CALCULATE ( MIN ( 'Date'[Date] ) )
- )
- )

We apply custom filters here to get all orders that were __ordered on or before the last day__ and were also __shipped on or after the first day__ of the selected Time Period. This is pretty straight forward and works just fine from a business point of view. However, performance could be much better as you probably already guessed if you read Alberto’s white-paper.

So I integrate his logic into my calculation and came up with this formula (Note that I could not use the final Yoda-Solution as I am using a DISTINCTCOUNT here):

- [MyOpenOrders_TimePeriod] :=
- CALCULATE (
- DISTINCTCOUNT ( 'Internet Sales'[Sales Order Number] ),
- GENERATE (
- VALUES ( 'Date'[Date] ),
- FILTER (
- 'Internet Sales',
- CONTAINS (
- DATESBETWEEN (
- 'Date'[Date],
- 'Internet Sales'[Order Date],
- 'Internet Sales'[Ship Date]
- ),
- [Date], 'Date'[Date]
- )
- )
- )
- )

To better understand the calculation you may want to rephrase the original requirement to this: “An open order is an order that was open on at least one day in the selected Time Period”.

I am not going to explain the calculations in detail again as the approach was already very well explained by Alberto and the concepts are the very same.

An alternative calculation would also be this one which of course produces the same results but performs “different”:

- [MyOpenOrders_TimePeriod2] :=
- CALCULATE (
- DISTINCTCOUNT ( 'Internet Sales'[Sales Order Number] ),
- FILTER (
- GENERATE (
- SUMMARIZE (
- 'Internet Sales',
- 'Internet Sales'[Order Date],
- 'Internet Sales'[Ship Date]
- ),
- DATESBETWEEN (
- 'Date'[Date],
- 'Internet Sales'[Order Date],
- 'Internet Sales'[Ship Date]
- )
- ),
- CONTAINS ( VALUES ( 'Date'[Date] ), [Date], 'Date'[Date] )
- )
- )

I said it performs “different” as for all DAX calculations, performance also depends on your model, the data and the distribution and granularity of the data. So you should test which calculation performs best in your scenario. I did a simple comparison in terms of query performance for AdventureWorks and also my customer’s model and results are slightly different:

Calculation (Results in ms) | AdventureWorks | Customer’s Model |

[MyOpenOrders_FILTER] | 58.0 | 1,094.0 |

[MyOpenOrders_TimePeriod] | 40.0 | 390.8 |

[MyOpenOrders_TimePeriod2] | 35.5 | 448.3 |

As you can see, the original FILTER-calculation performs worst on both models. The last calculation performs better on the small AdventureWorks-Model whereas on my customer’s model (16 Mio rows) the calculation in the middle performs best. So it’s up to you (and your model) which calculation you should prefer.

The neat thing is that all three calculations can be used with any existing hierarchy or column in your Date-table and of course also on the Date-Level as the original calculation.

nice piece of work, Gerhard! Keep up the great work and contribution to the community! Happy Holidays!

Thanks for sharing!

I found that in the real world (hard to see in Adventure Works) you can achieve a good optimization by applying a filter on a low-cardinality column in the fact table. For example, if you are filtering a range of dates and the maximum distance between Order Date and Ship Date is 60 days, you might exclude all the “other” years. Depending on data distribution, this type of optimization can be very effective (you should remove at least 80-90% of rows in order to see the improvement).

Hey Marco,

I am aware that it is very beneficial in terms of performance if the distance between the two used dates (Order Date and Ship Date) is low but what exactly do you mean by “excluding all the ‘other’ years”?

I guess what you wanted to point out is that if e.g. August 2014 is selected and the maximum distance is 60 days the calculation only needs to consider June 2014 until October 2014 which would improve performance significantly

-gerhard

Yes exactly that!

Hey Gerhard,

this is an awesome piece of work, but it seems I have some trouble to use.

In my scenario I use an date table that is not related to the facttable, using all of your 3 solutions provide the same results (gladly).

This is something like an aggregated distinctcount over time …

Unfortunately this does not work with PowerView due to the lack of the relationship. As soon as I create the relationship I no longer have my (aggregated distinct count over time). Do you have an idea how I can solve my problem?

Kind regards and once again thanks for sharing

Tom

Hi Tom,

what exactly do you mean by “it does not work with PowerView”?

I guess you are referring to dynamic filtering? e.g. clicking on a bar-chart?

As soon as you have an active relationship between the Fact-table and the Date-table you would need to adopt the calculations. Otherwise you would have two relationships, the active one and the one that is created on the fly using FILTER/CONTAINS

-gerhard

Hey Gerhard,

in my first solution there was no relationship between the date table and the fact table this worked smoothly using the generate/contains approach.

Until PowerView reminded me that it needs a relationship, this was the reason why I wrote “it does not work in PowerView” 🙂

Now I have two date dimensions one is related (the normal date table) and one that is not related (even not inactive), I call it date_calc.

I adjusted the Generate/Contains approach a little and voila. I’m glad that there are some days off ahead, so that there is some time to optimize my solution.

Now I’m able to track the development (evolution) of things within a certain timeframe using a status, where the things reach a status (hopefully) or stay at their first status (the do not evolve).

Once again, thanks for your post! It saved me a lot of thinking 🙂

Hey Gerhard,

Thank you for this great post. I have a similar but slightly different question:

What would I do If i need to count the number of sales orders that have been ordered but not delivered on a single day?

Thank you!

Wesley

Hi Wesley,

I think this is very similar to what Chris Webb discussed in his post: http://blog.crossjoin.co.uk/2013/06/13/a-new-events-in-progress-dax-pattern/

and what is also used as a starting point for this blog post.

If your request is simply to find orders which were ordered but not delivered on the very same day, you could imply use this formula:

SoldButNotShippedSameDay:=CALCULATE(

DISTINCTCOUNT(‘Facts'[OrderID]),

FILTER(‘Facts’,

‘Facts'[OrderDate] <> ‘Facts'[ShipDate])

)

hope that helps to solve your issue

-gherard

Hi Gherard,

I have a similar problem to Wesley. The problem I am encountering is that I am receiving a time period from the report and would like to evaluate if an order was shipped but not received on the first day of that date range.

Thanks!

Nishka

Hi Nishka,

do you mean that if you select July, you want to find orders that have been shipped on July 1st but not received on July 1st?

or shipped on July 1st but not received in the whole time-period of the report?

-gerhard

Yes I mean the first scenario, if the selection is June I want to check if on June 1st the order was shipped but not received

sorry for the delay but I was on vacation.

so for orders that have been shipped on June 1st but not received on

a) June 1st

b) until June 31st (last day of selection)

c) until now (not received at all)

which one is correct for you?

Hello,

I have utilized the dax code behind this idea in an effort to breakdown open case age by month. Had to make a slight modification in terms of Min/Max and the inequality signs to pull the total open cases at the end of the month:

Open Cases = CALCULATE( DISTINCTCOUNT(DocWebTouchPoints_dw_Incident[IncidentID]),

FILTER( DocWebTouchPoints_dw_Incident, DocWebTouchPoints_dw_Incident[ArriveDateTime] CALCULATE(MAX(‘Date Table'[Date Key]))))

However, I am still getting total cases that arrived prior to end of month AND closed within the last month. (There is no relationship between fields used.)

Would greatly appreciate your assistance or a redirect to a webpage resource as this has been boggling my mind for a few days now.

Woops, this is the code:

Open Cases = CALCULATE( DISTINCTCOUNT(DocWebTouchPoints_dw_Incident[IncidentID]),

FILTER( DocWebTouchPoints_dw_Incident, DocWebTouchPoints_dw_Incident[ArriveDateTime] CALCULATE(MAX(‘Date Table'[Date Key]))))

Hi Vanessa,

it seems that the operator (“=”) got lost in your comment – assuming that it was an equal sign as in your other comment. then the calculation would simply return all distinct incidents which ARRIVED on the last day of the selected time period.

you would need to filter on both, Arrival-Date and Close-Date.

You should first check out the articles that i linked to at the beginning of the blog and make your calculation work for a single day – then you can adopt it accordingly as described here

kind regards,

-gerhard

Open Cases SR = CALCULATE ( DISTINCTCOUNT(DocWebTouchPoints_dw_Incident[IncidentID]),

FILTER(DocWebTouchPoints_dw_Incident, DocWebTouchPoints_dw_Incident[ArriveDateTime] = CALCULATE(MAX(‘Date Table'[Date Key])) && DocWebTouchPoints_dw_Incident[Type] = “SERVICE REQUEST”))

Hi Gerhard,

I have a similar situtation where user wants to select a date and the report has to find active ppl that point of time. I have only one table which has the start date and end date. And I have a time dimension. I don’t know which column should I join the date dimension to (if i should). As until I join ssas tabular won’t find a unique column.

Kindly share your expert suggestions.

Best Regards

GS

Hi GS,

the links at the very beginning of the post should cover your problem, this post is just an advanced version which also allows you to select a time range (e.g. months or weeks)

In general, you will not have an (active) relationship between your tables as relationships always require you to link two columns directly but do not allow “between”-joins. Thats why we need to use FILTER() and other functions to create this join on the fly.

but there also seems to be an issue with your Date-Table as it should contain unique dates and you should be able to link it to your fact-table – but again, for all Events-In-Progress-Calculations you must not have any active relationships between your tables

hth,

-gerhard