highpost
highpost

Reputation: 1323

duplicate fields with an inner join

I'm having trouble understanding how to do a multi-table join without generating lots of duplicate fields.

Let's say that I have three tables:

If I do a simple select:

select family.id, family.name from family
    order by family.id;

I get a simple list:

ID Name
1  Smith
2  Jones
3  Wong

If I add an inner join:

select family.id, family.name, parent.first_name, parent.last_name
  from family
  inner join parent
  on parent.family = family.id
  order by family.id;

I get some duplicated fields:

ID Name   Parent
1  Smith  Howard Smith
1  Smith  Janet Smith
2  Jones  Phil Jones
2  Jones  Harriet Jones
3  Wong   Billy Wong
3  Wong   Rachel Wong

And if I add another inner join:

select family.id, family.name, parent.first_name, parent.last_name
  from family
  inner join parent
  on parent.family = family.id
  inner join child
  on child.family = family.id
  order by family.id;

I get even more duplicated fields:

ID Name   Parent         Child
1  Smith  Howard Smith   Peter Smith
1  Smith  Howard Smith   Sally Smith
1  Smith  Howard Smith   Fred Smith
1  Smith  Janet Smith    Peter Smith
1  Smith  Janet Smith    Sally Smith
1  Smith  Janet Smith    Fred Smith
2  Jones  Phil Jones     Mark Jones
2  Jones  Phil Jones     Melissa Jones
2  Jones  Harriet Jones  Mark Jones
2  Jones  Harriet Jones  Melissa Jones
3  Wong   Billy Wong     Mary Wong
3  Wong   Billy Wong     Jennifer Wong
3  Wong   Rachel Wong    Mary Wong
3  Wong   Rachel Wong    Jennifer Wong

What I would prefer, because it's more human readable, is something like this:

ID Name   Parent         Child
1  Smith  Howard Smith   Peter Smith
          Janet Smith    Sally Smith
                         Fred Smith
2  Jones  Phil Jones     Mark Jones
          Harriet Jones  Melissa Jones
3  Wong   Billy Wong     Mary Wong
          Rachel Wong    Jennifer Wong

I know that one of the benefits of an inner join is to avoid presenting excess information through a Cartesian product. But it seems that I get something similar with a multi-table join. Is there a way to summarize each group as shown above or will this require post-processing with a scripting language like Python?

Thanks,

--Dan

Upvotes: 5

Views: 2336

Answers (3)

Sergey Kalinichenko
Sergey Kalinichenko

Reputation: 726479

This is precisely the way the relation databases work: each row must contain all information in itself, with every single field that you request. In other words, each row needs to make sense in isolation from all other rows. If you do a single query and you need to get all three levels of information, you need to deal with eliminating duplicates yourself for the desired formatting.

Alternatively, you can run three separate queries, and then do in-memory joins in code. Although this may be desirable in certain rare situations, it is generally a wrong way of spending your development time, because RDBMS are usually much more efficient at joining relational data.

Upvotes: 2

Jizzle
Jizzle

Reputation: 214

You've hit it on the head. You'll need some post processing to get the results you're looking for.

Upvotes: 1

Joe Enos
Joe Enos

Reputation: 40383

SQL query results are always simple tabular data, so to get the results you're looking for would definitely not be a pretty query. You could do it, but it would involve quite a bit of query voodoo, storing things in temporary tables or using cursors, or some other funky workaround.

I'd definitely suggest using an external application to retrieve your data and format it appropriately from there.

ORMs like Entity Framework in .NET can probably do this pretty easily, but you could definitely do this with a few nested collections or dictionaries in any language.

Upvotes: 0

Related Questions