Kreeeeee
Kreeeeee

Reputation: 35

PowerBI: How to get distinct count for a column in a table, while grouping for many columns separately

I have a table with multiple date columns, and a single label column, as shown by following code

Data = DATATABLE (
"Date1", DATETIME,
"Date2", DATETIME,
"Label", STRING,
{
    { "2020-01-01","2020-01-02", "A" },
    { "2020-01-01","2020-01-01", "A" },
    { "2020-01-01","2020-01-02", "B" },
    { "2020-01-01","2020-01-01", "D" },
    { "2020-01-01","2020-01-02", "E" },
    { "2020-01-02","2020-01-01", "A" },
    { "2020-01-02","2020-01-02", "B" },
    { "2020-01-02","2020-01-01", "C" }
}
)

I want to plot a chart of count of distinct labels for each day, when considering date1, as well as when considering date2. These need to be in same plot, as a clustered bar plot, as shown below. This means I need to get the values on a new date column.

The expected result looks like this,

Date                | value1    | value2
---------------------------------
1/1/2020 12:00:00 AM |  4       |   3 |
1/2/2020 12:00:00 AM |  3       |   3 |

clustered bar plot

Current Solution:

  1. I am creating two different tables for each of the counts, as follows

     Date1_Count = 
     ADDCOLUMNS (
         ALL ( Data[Date1] ),
         "Count",
             CALCULATE (
                 DISTINCTCOUNT ( Data[Label] )
             )
     )
    

    and

     Date2_Count = 
     ADDCOLUMNS (
         ALL ( Data[Date2] ),
         "Count",
             CALCULATE (
                 DISTINCTCOUNT ( Data[Label] )
             )
     )
    
  2. Then I create a third table with dates as such,

     Final_Counts = CALENDAR("2020-01-01", "2020-01-04")
    
  3. Next, I add relationship between the three dates, viz. Date1_Count table, Date2_Count table, and Final_Counts table

  4. Finally, I combine the data using RELATED function as follows

     value1 = RELATED(Date1_Count[Count])
     value2 = RELATED(Date2_Count[Count])
    

Question

Is there a simpler solution that does not require creating one table per date column? The current method is not scalable to many date columns.

Upvotes: 2

Views: 789

Answers (2)

Alexis Olson
Alexis Olson

Reputation: 40204

Assuming you only have a handful of date columns, you just need a single date dimension table and one measure per date column.

Define a date table to use on the x-axis (no relationships to other tables):

DimDate = CALENDAR("2020-01-01", "2020-01-04")

Then define measures that match the various date columns to the date table:

value1 =
CALCULATE (
    DISTINCTCOUNT ( Data[Label] ),
    Data[Date1] IN VALUES ( DimDate[Date] )
)

and

value2 =
CALCULATE (
    DISTINCTCOUNT ( Data[Label] ),
    Data[Date2] IN VALUES ( DimDate[Date] )
)

Result


If you have more than a handful of DateN columns, then you'd probably be best served to reshape your data where you unpivot all those columns.

For just the two you have the data would look like

Unpivot

In this case, you use Unpivot[Column] as the Legend and only need a single measure:

value = 
CALCULATE (
    DISTINCTCOUNT ( Unpivot[Label] ),
    Unpivot[Date] IN VALUES ( DimDate[Date] )
)

This gives a similar looking result:

One measure

Upvotes: 1

sergiom
sergiom

Reputation: 4877

It is possible to obtain the Final_Counts calculated table in one step, using ADDCOLUMNS to iterate over Data[Date1], and then calculating Value1 as the DISTINCTCOUNT over the Data table filtered on the currently iterated Date1.

This work thanks to the CALCULATE statement that triggers a context transition.

Obtaining the Value2 requires to create a new filter context over Date2 using the currently iterated Date1.

First we save the current Date1 in a variable to be used inside CALCULATE in the filter expression on Date2.

We also need REMOVEFILTERS( Data ) to remove the filter context over Date1 set by the context transition.

Final_Counts =
ADDCOLUMNS(
    ALL( Data[Date1] ),
    "Value1",
        CALCULATE(
            DISTINCTCOUNT( Data[Label] )
        ),
    "Value2",
        VAR CurrentDate = Data[Date1]
        RETURN
            CALCULATE(
                DISTINCTCOUNT( Data[Label] ),
                REMOVEFILTERS( Data ),
                Data[Date2] = CurrentDate
            )
)

Upvotes: 1

Related Questions