Reputation: 2310
I often find myself performing a couple of independent joins off a table. For instance, say we have the table collections
, which has independent one-to-N relationships with both photos
and songs
, where N is from zero to many.
Now, say we want to get a collection, and both its (independently) associated photos and songs.
I would typically use something like this:
SELECT
collections.collectionid as collectionid,
photos.name as photo_name,
songs.name as song_name
FROM collections
LEFT JOIN photos ON collections.collectionid = photos.collectionid
LEFT JOIN songs ON collections.collectionid = songs.collectionid
WHERE collections.collectionid = 14
Of course, left-joining one table to two other tables, if the first join results in M
rows and the second in N
rows, gives M * N
rows. This would seem suboptimal in terms of database-traffic and performance.
+--------------+------------+-----------+
| collectionid | photo_name | song_name |
+--------------+------------+-----------+
| 14 | 'x' | 'a' | \
| 14 | 'x' | 'b' | - Each photo is returned 3 times,
| 14 | 'x' | 'c' | / because 3 songs are returned.
| 14 | 'y' | 'a' | \
| 14 | 'y' | 'b' |
| 14 | 'y' | 'c' | /
+--------------+------------+-----------+
Alternatively, you can perform two selects: two separate queries, each joining collections
to a different table, giving M + N
rows:
SELECT
collections.collectionid as collectionid
song.name as song_name
FROM collections
LEFT JOIN songs on collections.collectionid = songs.collectionid
WHERE collections.collectionid = 14
and:
SELECT
collections.collectionid as collectionid
photos.name as photo_name
FROM collections
LEFT JOIN photos on collections.collectionid = photos.collectionid
WHERE collections.collectionid = 14
giving:
+--------------+------------+ +--------------+------------+
| collectionid | song_name | | collectionid | photo_name |
+--------------+------------+ +--------------+------------+
| 14 | 'a' | | 14 | 'x' |
| 14 | 'b' | | 14 | 'y' |
| 14 | 'c' | +--------------+------------+
+--------------+------------+
My question: What is the best way to handle this?
Neither of the above seems optimal. So, is there another way that results in M + N
rows, yet can be done in a single query?
Upvotes: 5
Views: 5924
Reputation: 50970
Your first option (two independent JOINs) doesn't seem to provide you with a very useful result set (because the two subsidiary tables produce a semi-cartesian product and you have to de-duplicate the results in your application code).
The second option (two separate queries) is okay, unless you want to treat the results of the two queries as a single set for presentation purposes (for instance, sort them all together by a date field).
The best solution, I think, is to combine the two queries into one with UNION ALL
, producing a single result set with only the rows you actually want:
SELECT
collections.collectionid as collectionid,
photos.name as photo_name,
'photo' as document_type
FROM collections
LEFT JOIN photos on collections.collectionid = photos.collectionid
WHERE collections.collectionid = 14
UNION ALL
SELECT
collections.collectionid as collectionid,
song.name as photo_name
'song' as document_type
FROM collections
LEFT JOIN songs on collections.collectionid = songs.collectionid
WHERE collections.collectionid = 14
This kind of result set can be ORDERed BY
any field across the entire, combined set of records, allowing (for instance) to get the 20 most recent documents attached to the collection regardless of what type they are.
Upvotes: 5
Reputation: 30155
It does seem that the relationship between photos and permissions is undefined, which results in the cross-join you speak of. Yes, at face value, doing two queries is better than what you have. However, the real question is why do photos and permissions have no key-based relationship?
But perhaps I'm not understanding your overall schema. Perhaps all the permissions pertain to a single user. If yes, then I'd consider placing all the permissions on a single row (of several columns or in an XML blob), rather than in several rows. Doing so would permit a single query to fetch all the values without resulting in an unintentional cross-join.
Upvotes: 0