danronmoon
danronmoon

Reputation: 3873

Left Join Subselect with LIMIT in MySQL

I have 3 tables:

actor

|    FIELD |             TYPE | NULL | KEY | DEFAULT |          EXTRA |
|----------|------------------|------|-----|---------|----------------|
| actor_id | int(10) unsigned |   NO | PRI |  (null) | auto_increment |
| username |      varchar(30) |   NO |     |  (null) |                |


tag
|  FIELD |             TYPE | NULL | KEY | DEFAULT |          EXTRA |
|--------|------------------|------|-----|---------|----------------|
| tag_id | int(10) unsigned |   NO | PRI |  (null) | auto_increment |
|  title |      varchar(40) |   NO |     |  (null) |                |

actor_tag_count
|            FIELD |             TYPE | NULL | KEY |           DEFAULT |                       EXTRA |
|------------------|------------------|------|-----|-------------------|-----------------------------|
|         actor_id | int(10) unsigned |   NO | PRI |            (null) |                             |
|           tag_id | int(10) unsigned |   NO | PRI |            (null) |                             |
|       clip_count | int(10) unsigned |   NO |     |            (null) |                             |
| update_timestamp |        timestamp |   NO |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |

SQLFiddle

I want to get the 5 most frequent (highest clip_count) and most recently updated (latest update_timestamp) tags for each actor.

My attempted query is:

SELECT
    `a`.`actor_id`,
    `a`.`username`,
    GROUP_CONCAT(atc.clip_count) AS `tag_clip_counts`,
    GROUP_CONCAT(t.tag_id) AS `tag_ids`,
    GROUP_CONCAT(t.title) AS `tag_titles`
FROM
    `actor` AS `a`
LEFT JOIN (
    SELECT
        `atc`.`actor_id`,
        `atc`.`tag_id`,
        `atc`.`clip_count`
    FROM
        `actor_tag_count` AS `atc`
    INNER JOIN `actor` AS `a` USING (actor_id)
    ORDER BY
        atc.clip_count DESC,
        atc.update_timestamp DESC
    LIMIT 5
) AS `atc` USING (actor_id)
LEFT JOIN `tag` AS `t` ON atc.tag_id = t.tag_id
GROUP BY
    `a`.`actor_id`

The problem is that the left join subselect is only calculated once and the tags for every result in the set are only fetched from a pool of 5 tags.

Expected GROUP_CONCAT'd tag title results for Keanu Reeves:

comedy, scifi, action, suspense, western (Both western and documentary have a clip_count of 2, but western should come first because it has a later update_timestamp)

I'm not sure this is a point of any relevance, but I am executing other joins on the actors table but had them removed for this question. It would be highly preferable to make this all 1 query, but I'm stumped on how to do this even with 2 queries. 1-or-2-query solutions appreciated.

Upvotes: 0

Views: 1712

Answers (2)

danronmoon
danronmoon

Reputation: 3873

SQLFiddle, with the help of a very nice answer about using a GROUP_CONCAT limit workaround:

SELECT
    `a`.`actor_id`,
    `a`.`username`,
    SUBSTRING_INDEX(GROUP_CONCAT(atc.clip_count ORDER BY atc.clip_count DESC, atc.update_timestamp DESC), ',', 5) AS `tag_clip_counts`,
    SUBSTRING_INDEX(GROUP_CONCAT(t.tag_id ORDER BY atc.clip_count DESC, atc.update_timestamp DESC), ',', 5) AS `tag_ids`,
    SUBSTRING_INDEX(GROUP_CONCAT(t.title ORDER BY atc.clip_count DESC, atc.update_timestamp DESC), ',', 5) AS `tag_titles`
FROM
    `actor` AS `a`
LEFT JOIN actor_tag_count AS `atc` USING (actor_id)
LEFT JOIN `tag` AS `t` ON atc.tag_id = t.tag_id
GROUP BY
    `a`.`actor_id`

Upvotes: 1

Kickstart
Kickstart

Reputation: 21513

It is possible by adding a sequence number, but might not perform well on large tables.

Something like this (not tested):-

    SELECT actor_id,
        username,
        GROUP_CONCAT(clip_count) AS tag_clip_counts,
        GROUP_CONCAT(tag_id) AS tag_ids,
        GROUP_CONCAT(title) AS tag_titles
    FROM
    (
    SELECT  actor.actor_id,
            actor.username,
            atc.clip_count, 
            tag.tag_id,
            tag.title,
            @aSeq := IF(@aActorId = actor.actor_id, @aSeq, 0) + a AS aSequence,
            @aActorId := actor.actor_id
    FROM
    (
        SELECT actor.actor_id,
            actor.username,
            atc.clip_count, 
            tag.tag_id,
            tag.title
        FROM actor
        LEFT JOIN actor_tag_count AS atc ON actor.actor_id = atc.actor_id
        LEFT JOIN tag ON atc.tag_id = tag.tag_id
        ORDER BY actor.actor_id, atc.clip_count DESC, atc.update_timestamp DESC
    )
    CROSS JOIN (SELECT @aSeq:=0, @aActorId:=0)
    )
    WHERE aSequence <= 5
    GROUP BY actor_id, username

A alternative would be to have a subselect that has a correlated sub query in the select statement (with a limit of 5), and then have an outer query that does the group concats. Something like this (again not tested)

SELECT
    actor_id,
    username,
    GROUP_CONCAT(clip_count) AS tag_clip_counts,
    GROUP_CONCAT(tag_id) AS tag_ids,
    GROUP_CONCAT(title) AS tag_titles
FROM
(
SELECT
    a.actor_id,
    a.username,
    (
    SELECT
        atc.clip_count,
        t.tag_id,
        t.title
    FROM actor_tag_count AS atc ON a.actor_id = atc.actor_id
    LEFT JOIN tag t ON atc.tag_id = t.tag_id
    ORDER BY atc.clip_count DESC, atc.update_timestamp DESC
    LIMIT 5
)
FROM actor a
)
GROUP BY actor_id, username

Upvotes: 0

Related Questions