sptfire101
sptfire101

Reputation: 13

power bi: Aggregation of Distinct Count Measure

I need to count how many teams have only one associated city with at least one employee. My data set has many columns, but the relevant ones for this case are the following:

Team    City    Employees
A      Shanghai    3
A      Beijing     5
B      Helsinki    1
B      Beijing     0
C      Berlin      10
D      Shanghai    5
D      Berlin      6

In the example above this count would be 2 (B and C), and I managed to create a measure that counts how many associated cities each team has using:

# of Cities = CALCULATE(DISTINCTCOUNT(Data[City]);FILTER('Data';Data'[Employee]>0))

In order to check which teams have employees in only one city I tried to use the simple measure:

Total Team w/ one city = IF( [# of Cities] = 1 ; 1 ; 0 ) 

but that is not aggregate-able.

Any ideas on how to do it? I'd prefer to have it as a measure instead of new calculated column so I can slice the results according to other columns in the dataset.

Thanks in advance!

Upvotes: 1

Views: 11267

Answers (1)

RADO
RADO

Reputation: 8148

There are several ways to do it; for example:

Total Team w/ one city =
COUNTROWS ( FILTER ( VALUES ( Data[Team] ), [# of Cities] = 1 ) )

How it works: VALUES creates a list of distinct teams. Then the list is filtered for # of Cities measure = 1 per team, and rows of the filtered list are counted.

Similarly, you can use SUMX function:

Total Team w/ one city =
SUMX( VALUES(Data[Team]), IF([# of Cities] = 1, 1))

Another common approach is to use SUMMARIZE function:

Total Team w/ one city =
COUNTROWS (
    FILTER (
        SUMMARIZE ( Data, 
                    Data[Team], 
                    "City Count", [# of Cities] ),
        [City Count] = 1 ) 
       )

Here, we first group Data table by Teams, and calculate each team's # of Cities metric. Then, the grouped table is filtered, leaving only teams with # of Cities = 1. The rows of the filtered table then are counted. The advantage of this approach vs the first one is that SUMMARIZE allows you to group table by more than one column, whereas VALUES allows only one. However, the first formula is faster (important if your data set is large).

By the way, # of Cities formula can be simplified a bit:

# of Cities = CALCULATE(DISTINCTCOUNT(Data[City]), Data'[Employees]>0)

Upvotes: 2

Related Questions