Brian Howard
Brian Howard

Reputation: 11

Tableau blending with aggregation across nulls

I'd like to color chart using a secondary data source with only a subset of ids:

I have a primary data source that associates projects with a start date and a very large number of vendors - a project may have many vendors. I have a secondary data source that links a subset of my vendors to a rank indicating which tier they are in.

I'd like to do an area chart by year of projects, with coloring detail to show the breakdown by vendor tiers. Since a project may involve many vendors, I'd like to color each project by highest tier of any vendor involved (e.g. max(tier)). Vendors not in my secondary data source should be assigned the lowest tier (0).

I have a calculated field in my secondary source that generates a number representing the tier from the ranking (lower ranking is better, any ranking better than none):

Tier: if isnull([rank]) then 0 else if [rank] >= 200 then 2 else 1

In my primary, I define another calculated field to do the aggregation for each project by tier:

VendorClass: if max[Secondary].[tier] == 0 then "Low vendors" elseif max[Secondary].[tier] == 2 'Top Vendors' else 'Mid Vendors'

When I try to graph the distinct count of project ids vs the VendorClass, however, I see every project getting the same VendorClass ('Top').

I suspect this is caused by the VendorClass field dropping projects for which there is a null in the secondary, despite the calculated field's attempt to turn nulls into 0.

Any help appreciated.....

Update:

So here are my tables:

Primary source:

projects: project_id | name | date_started | etc. 123 | project1 | 2009-01-01 456 | project2 | 2010-01-01 789 | project3 | 2010-06-01

vendors: vendor_id | name | address | etc. 1 | Foo | main st 2 | Bar | state st 3 | Fob | church st 4 | Bof | broadway

project-vendor project_id | vendor_id 123 | 1 123 | 2 456 | 2 456 | 3 456 | 4 789 | 3 789 | 4

Secondary source: vendor_id | rank | tier 1 | 1 | 2 2 | 200 | 1

Note: I can pass the tier from my data source or calculate it as a field in the secondary - doesn't matter - but the key is that vendors not in this table (e.g. 3 and 4) should be assigned to tier 0. Because of the number of vendors and issues staying synced, it's impractical to add every vendor to this data source.

I want a graph to show for each year the total number of distinct projects, but I want the coloring for each case to represent the highest tier of any vendor on that project:

2009 has a single project, 2 vendors, the max tier among them is 2 (vendor 1) so it's a column 1 unit high in e.g. blue for tier 2.

2010 has two projects. 456 has a max tier of 1 (vendor 2) and 789 has a max tier of 0 (vendors 3 and 4), and so 2010 has a column 2 units high, one segment in e.g. green for tier 1 and one segment in, e.g., red for tier 0.

Upvotes: 1

Views: 1457

Answers (2)

Caleb
Caleb

Reputation: 359

Your blend isn't required in the example above. Try replacing the blend with a join and the calculation will be simplified.

Upvotes: 0

Alex Blakemore
Alex Blakemore

Reputation: 11896

Try defining the tier calculated field in the primary datasource (referencing fields in the secondary)

To see the syntax for referencing a field in another datasource, use the pulldown menu under fields in the calculation editor to select the data source you need to reference.

It will have to be an aggregate calculation, which implies that that the behavior of the calculation depends on the dimensions on the viz because they determine the partitioning.

Upvotes: 0

Related Questions