Reputation: 1323
I'm having trouble understanding how to do a multi-table join without generating lots of duplicate fields.
Let's say that I have three tables:
If I do a simple select:
select family.id, family.name from family
order by family.id;
I get a simple list:
ID Name
1 Smith
2 Jones
3 Wong
If I add an inner join:
select family.id, family.name, parent.first_name, parent.last_name
from family
inner join parent
on parent.family = family.id
order by family.id;
I get some duplicated fields:
ID Name Parent
1 Smith Howard Smith
1 Smith Janet Smith
2 Jones Phil Jones
2 Jones Harriet Jones
3 Wong Billy Wong
3 Wong Rachel Wong
And if I add another inner join:
select family.id, family.name, parent.first_name, parent.last_name
from family
inner join parent
on parent.family = family.id
inner join child
on child.family = family.id
order by family.id;
I get even more duplicated fields:
ID Name Parent Child
1 Smith Howard Smith Peter Smith
1 Smith Howard Smith Sally Smith
1 Smith Howard Smith Fred Smith
1 Smith Janet Smith Peter Smith
1 Smith Janet Smith Sally Smith
1 Smith Janet Smith Fred Smith
2 Jones Phil Jones Mark Jones
2 Jones Phil Jones Melissa Jones
2 Jones Harriet Jones Mark Jones
2 Jones Harriet Jones Melissa Jones
3 Wong Billy Wong Mary Wong
3 Wong Billy Wong Jennifer Wong
3 Wong Rachel Wong Mary Wong
3 Wong Rachel Wong Jennifer Wong
What I would prefer, because it's more human readable, is something like this:
ID Name Parent Child
1 Smith Howard Smith Peter Smith
Janet Smith Sally Smith
Fred Smith
2 Jones Phil Jones Mark Jones
Harriet Jones Melissa Jones
3 Wong Billy Wong Mary Wong
Rachel Wong Jennifer Wong
I know that one of the benefits of an inner join is to avoid presenting excess information through a Cartesian product. But it seems that I get something similar with a multi-table join. Is there a way to summarize each group as shown above or will this require post-processing with a scripting language like Python?
Thanks,
--Dan
Upvotes: 5
Views: 2336
Reputation: 726479
This is precisely the way the relation databases work: each row must contain all information in itself, with every single field that you request. In other words, each row needs to make sense in isolation from all other rows. If you do a single query and you need to get all three levels of information, you need to deal with eliminating duplicates yourself for the desired formatting.
Alternatively, you can run three separate queries, and then do in-memory joins in code. Although this may be desirable in certain rare situations, it is generally a wrong way of spending your development time, because RDBMS are usually much more efficient at joining relational data.
Upvotes: 2
Reputation: 214
You've hit it on the head. You'll need some post processing to get the results you're looking for.
Upvotes: 1
Reputation: 40383
SQL query results are always simple tabular data, so to get the results you're looking for would definitely not be a pretty query. You could do it, but it would involve quite a bit of query voodoo, storing things in temporary tables or using cursors, or some other funky workaround.
I'd definitely suggest using an external application to retrieve your data and format it appropriately from there.
ORMs like Entity Framework in .NET can probably do this pretty easily, but you could definitely do this with a few nested collections or dictionaries in any language.
Upvotes: 0