Reputation: 1
Hello i am trying to make a website where i want a photo album.
Thing is that this site will have 2 tables, 1 table that stores name of photo album (TBLalbum
) and a unique id for that album (id
,albumname
).
The second table (TBLphotos
) stores information about uploaded photos and to what album each photo belongs(albumid
,photoname
and some more columns).
Now i want to list all album names which is not so hard to do, but also in same sql query i want it to show the first photo in each album.
I tried this:
SELECT DISTINCT albumid
FROM TBLphotos INNER JOIN TBLalbum
ON TBLalbum.id=TBLphotos.albumid
this does not give me the possibility to print out the column photoname
and if i include it like :
SELECT DISTINCT albumid,photoname
FROM TBLphotos INNER JOIN TBLalbum
ON TBLalbum.id=TBLphotos.albumid
then it doesn't get distinct
on just albumid
.
A bit hard for me to explain, but can anyone tell me how to solve this?
Upvotes: 0
Views: 77
Reputation: 2436
Try:
SELECT
a.albumid,
a.albumname,
(SELECT photoname FROM TBLphotos p WHERE p.albumid = a.id LIMIT 1) AS photoname
FROM TBLalbum a
The subquery limits the result set for each album to just one record with the LIMIT
clause. If you want a specific record from TBLphotos
, you can amend the subquery to put things in a certain order or do whatever you need to do.
Note that DISTINCT
is for filtering out duplicate rows, which is not (I think) what you are really trying to do here.
Upvotes: 1
Reputation: 72177
The correct query for your request is:
SELECT a.albumid, f.photoname
FROM TBLalbum a
INNER JOIN TBLphotos f # "f" from "first" photo
ON f.albumid = a.id # of this album
LEFT JOIN TBLphotos o # "o" from "other" photo
ON o.albumid = f.albumid # of the same album
AND o.photoid < f.photoid # "o" is before "f"
WHERE o.photoid IS NULL # no such "o" exists
This query joins table TBLalbum
(aliased as a
from "album") with TBLphotos
(aliased as f
from "first photo of this album"). Because of the INNER JOIN
, the albums without photos will not be included. Use LEFT JOIN
instead if you want them included.
Next, it joins the table TBLalbum
again (aliased as o
from "other photo"). The first join condition (o.albumid = f.albumid
) ensures the photos selected from o
are in the same album as their matches from f
. The second condition (o.photoid < f.photoid
) pairs a row from f
1 only with the rows from o
having a smaller photoid
(i.e. created earlier).
Replace this with your own definition of earlier in order to get the first photo as expected.
Because the second JOIN
is a LEFT JOIN
, it will add to the result set all the rows from f
1, including those that does not have any match in o
because there is no photo in o
that is earlier than the photo currently selected from f
. For these rows, a row full of NULL
s is used instead of the missing row from o
.
The WHERE
clause filters from the result set produced by the joins only the pairs (a
, f
, o
) that have NULL
values in o.photoid
; i.e. when no "other photo" in the same album was found in o
as being earlier than a certain photo from f
. These are the first photos of each album you expect.
1 The LEFT JOIN
combines the tables a
and f
already joined (left) with table o
(right). The result set will contain all the combinations of rows produced by the INNER JOIN
of a
and f
, at least once (and not all the rows from f
as said above; I preferred to write it this way to make it simpler).
2 I assumed there is a field named photoid
in table TBLphotos
and it is the PK
of the table (it's important to have distinct values). If you change the sorting criterion to define the first photo from the table (for example, using the upload time) and the field you use does not have distinct values then on tie the query returns all the photos having the same smallest value for the chosen column. You can fix this easily by keeping/adding into the ON
clause the condition that uses photoid
(or other field that could solve the tie).
3 You can put whatever fields you want/need into the SELECT
clause but only from tables a
and f
. Nothing stops you to put fields from o
but all of them are NULL
. Read the entire explanation again if you didn't understand why.
4 If you want to understand how this query works then remove the WHERE
clause, run it using a MySQL client and analyze the result set.
5 You'll find a lot of answers to similar questions on SO tagged greatest-n-per-group. If they use GROUP BY
avoid them because they are wrong and not conformant with the SQL standard (most of them work by accident). If they use sub-queries (with or without GROUP BY
in the inner query) they are probably correct but most of the time they are slow. Sub-queries are difficult to optimize, and a lot of times it is not even possible. If they use MySQL variables then they are also slow. I don't know why MySQL variables are slow, I guess it's because the query planner cannot make the best use of indexes in this case.
Upvotes: 1