Reputation: 19648
I am working with some sales data that each transaction has multiple stages. say stage A, B, C, D...
In the perfect world, each transaction need to go to each stage and I want to calculate the difference the transaction stays at each stage.
Say the data might look like this:
transaction_id stage updated_date
1 A 2015-01-01
1 B 2015-01-02
1 C 2015-01-05
1 D 2015-01-06
I want something like:
transaction_id stage time_spent
1 AB 1 day
1 BC 3 days
1 CD 1 day
Then I will be able to analyze the time spent at each stage and look at the histogram or distribution. However, the only thing I can find is the function datediff.
Can anyone show me how to achieve this in Tableau?
Upvotes: 3
Views: 1810
Reputation: 11919
Here is one way to reshape your data. There are other alternatives.
This approach has a data row for each entrance or exit of a stage.
transaction_id date action stage
1 2015-01-01 enter A
1 2015-01-02 exit A
1 2015-01-02 enter B
etc
This means you have to be careful about enforcing data quality. Be sure no one exits a stage before entering, or that you don't have unmatched pairs of enter/exits (except for when a transaction is currently still in a stage and has entered but not yet exited) So its fine for automatically recorded data, less so for hand entered data.
Now you can count the # of transactions that passed through a stage by just counting exits, or track the # in a stage at any time by using a running total of entrances - exits. There is no assumption that transactions always follow the same path, so its possible to get rejected at C and return to A.
You could probably derive the time spent in each stage, but that might be easier to manage if you could add a column for that and include in the data for each exit.
As I mentioned, this is just one possible structure. The best table structure depends on the questions you want to answer.
Upvotes: 1
Reputation: 3423
You could try using LOOKUP()
to compute the duration using the date in the previous row of your partition:
DATEDIFF('day', LOOKUP(ATTR([Updated Date]), -1), ATTR([Updated Date]))
Then to get your "AB" labels, use LOOKUP()
again to fetch it from the previous row:
LOOKUP(ATTR([stage]), -1) + ATTR([stage])
To make a histogram with the fields, you can drag [stage]
to Columns, followed by [stage label]
(or whatever you choose to call that second calculated field we made). Drag [duration]
(the first field we made) into Rows. Then right click "A" in your chart and click "Hide". It's important that you hide it rather than filter it out, because it still needs to be in your partition for the table calculations to work! Then right click on the [stage]
pill in your Columns card and uncheck "Show Header".
It's important to note that if Stage C (for example) isn't present in your data, then you will end up with a "BD" in your histogram, so here's hoping your data's pristine.
Caveat - One cost of using table calculations like lookup() is that they only operate on aggregated query results, which means all the data rows in question have to be fetched from the database to the Tableau client or server. So if you have a large amount of data and only want to display summary results such as the min/avg/max time to complete each phase, then table calcs can be an expensive and complicated way to get those summary measures.
An alternative is to reshape your data and use row level calcs, aggregate calcs or LOD calcs -- all of which can be performed by a source database. That is often simpler to manage also.
Upvotes: 6