Reputation: 2125
This is all going to have to be pseudo as I am on my phone and have no internet access right now as I have just moved but its bugging the crap out of me. This also means I can't do code blocks please bear with me: I'll try.
I have a table with amounts in it, and I have a table with labels. I want to sum the amounts in the first table grouped by the labels. The problem is, if there are no records for a label existing in the table with the amounts then I don't get a record in the result set for that label. I need a record there with nulls for the amount tables field. Here is what some sample data might look like:
Columns: id, tpa, amt, link_to_label_table
Data:
1, GTL, 2000, 1
2, GTL, 1000, 1
Columns: link_to_amount_table, label_name
Data:
1, Label1
2, Label2
Select at.tpa, sum(at.amt) as amt, lt.label_name
From Amount_table as at
Left join Label_tabl lt on lt.link_to_amount_table = at.link_to_label_table
Where at.tpa = 'GTL'
Group by lt.label, at.tpa
Now this returns:
GTL, 3000, Label1
I tried selecting from the labels table then left joining the amount table and it still didn't give my desired results which are:
GTL, 3000, Label1
Null, Null, Label2
Is this possible with the sum and group by? The fields being grouped by have to be there otherwise you get an error. This is in DB2 by the way. Is there any way possible to get this to return the way I need it? I have to get the labels; they are dynamic.
Upvotes: 0
Views: 2314
Reputation: 754590
On the face of it, you want to have your labels table as the dominant table and the amounts table as the one that is outer joined.
SELECT a.tpa, sum(a.amt) as amt, l.label_name
FROM Label_table AS l
LEFT JOIN Amount_table AS a
ON l.link_to_amount_table = a.link_to_label_table
GROUP BY l.label, a.tpa
You have a condition Amount_table.tpa = 'GTL'
; it is not entirely clear why you have that, but presumably it is significant with more data in the tables. There are (at least) two ways you can incorporate that condition into the query (other than the one you chose - which eliminates the rows where a.tpa
is null).
SELECT a.tpa, sum(a.amt) as amt, l.label_name
FROM Label_table AS l
LEFT JOIN Amount_table AS a
ON l.link_to_amount_table = a.link_to_label_table
AND a.tpa = 'GTL'
GROUP BY l.label, a.tpa
Or:
SELECT a.tpa, sum(a.amt) as amt, l.label_name
FROM Label_table AS l
LEFT JOIN (SELECT *
FROM Amount_table
WHERE tpa = 'GTL') AS a
ON l.link_to_amount_table = a.link_to_label_table
GROUP BY l.label, a.tpa
A decent optimizer will produce the same query plan for both, so it probably doesn't matter which you use. There's an argument that suggests the second alternative is cleaner in that the ON clause is primarily for joining conditions, and the filter condition on a.tpa
is not a joining condition. There's another argument that says the first alternative avoids a sub-query and is therefore preferable. I'd validate that the query plans are the same and would probably choose the second, but it is a somewhat nebulous decision based on a mild preference.
Upvotes: 2
Reputation: 5650
You were so close on your second try. Change WHERE to AND. This has the effect of applying at.tpa='GTL' to the JOIN instead of applying it to the filter so you don't filter out the NULLs.
Upvotes: 0