user2597933
user2597933

Reputation: 63

Alternative to using 'LIMIT' within a MySQL 'NOT IN' clause?

I have a table of tags and a table of shows. Each show has around 100 tags each, but I only want to extract 30 tags for each show.

The way I want to select the 30 tags is by first extracting the top 20, based on popularity (hit count).

Then I want to select 10 random tags that don't appear within the top 20, join them all together and order alphabetically.

The "tags" table contains 4 columns: ID, show_id, tag and hit_count

I'm not great at SQL, but this is what I came up with:

SELECT * FROM (

    (

        SELECT tag
        FROM tags
        WHERE show_id = x
        AND ID NOT IN 

            (SELECT ID
            FROM tags
            WHERE show_id = x
            ORDER BY hit_count DESC
            LIMIT 20)

        ORDER BY RAND() DESC
        LIMIT 10

    )

    UNION 

    (

        SELECT tag
        FROM tags
        WHERE show_id = x
        ORDER BY hit_count DESC
        LIMIT 20

    )

) AS reorder
ORDER BY reorder.tag ASC

However, MySQL returns the following error:

This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'

If my version of MySQL doesn't support using 'LIMIT' in 'NOT IN' clauses then I need to completely rethink the SQL, but I'm struggling to find a solution. Can anyone help? Thanks.

UPDATE:

As an alternative to picking the random 10 tags, I also tried:

SELECT * FROM (

      SELECT tag
      FROM tags
      WHERE show_id = x
      ORDER BY hit_count DESC
      LIMIT 20,100

) AS rnd_10 ORDER BY RAND() LIMIT 10

But this returns some tags that should be limited to the top 20, and I can't figure out why :\

Upvotes: 1

Views: 1787

Answers (2)

FuzzyTree
FuzzyTree

Reputation: 32392

You can left join a derived table instead of using IN (subquery)

    SELECT t.tag
    FROM tags t
    LEFT JOIN (
        SELECT ID
        FROM tags
        WHERE show_id = x
        ORDER BY hit_count DESC
        LIMIT 20
    ) b ON b.ID = t.ID
    WHERE t.show_id = x AND b.ID IS NULL
    ORDER BY RAND() DESC
    LIMIT 10

Another way using subqueries if you have ties:

select * from tags where hit_count < (
    select hit_count from tags order by hit_count desc limit 1 offset 19
) order by rand() limit 10

Upvotes: 2

Ankit Bajpai
Ankit Bajpai

Reputation: 13509

Why you are using LIMIT in sub-queries. You can use it at the end of the query. Something like this -

SELECT * FROM (

  (

    SELECT tag
    FROM tags
    WHERE show_id = x
    AND ID NOT IN 

        (SELECT ID
        FROM tags
        WHERE show_id = x
        ORDER BY hit_count DESC)

    ORDER BY RAND() DESC
  )

  UNION 

  (

    SELECT tag
    FROM tags
    WHERE show_id = x
    ORDER BY hit_count DESC
  )
) AS reorder
ORDER BY reorder.tag ASC
LIMIT 20

Upvotes: -1

Related Questions