MoshiBin
MoshiBin

Reputation: 3196

Avoiding using the same subquery multiple times in a query

In an MMORPG server I am refactoring, I have two tables. One for items, and one for spells. Each item has up to 5 spells, so I went with a sparse-matrix format, having 5 columns for spell IDs.

The original devisers of this structure chose to use MyISAM which does not support referencing, causing the items table to contain items with non-existant spell IDs. I wish to find out which items have incorrect spell IDs in order to fix them and perhaps in the long run convert to InnoDB.

So far I've been able to come up with only this:

SELECT COUNT(*)
  FROM items
 WHERE spellid_1 NOT IN (SELECT entry FROM research.spell)
    OR spellid_2 NOT IN (SELECT entry FROM research.spell)
    OR spellid_3 NOT IN (SELECT entry FROM research.spell)
    OR spellid_4 NOT IN (SELECT entry FROM research.spell)
    OR spellid_5 NOT IN (SELECT entry FROM research.spell);

Is there a more elegant way to do so?

EDIT: NULL spellid_n counts as valid since it just means the item doesn't have a spell in that slot.

Upvotes: 3

Views: 4360

Answers (4)

Mladen Prajdic
Mladen Prajdic

Reputation: 15685

try the "reverse not in" as it is called:

SELECT COUNT(*)  
FROM items 
WHERE (SELECT entry FROM research.spell) NOT IN (spellid_1, spellid_2, 
                                                 spellid_3, spellid_4, 
                                                 spellid_5)

EDIT: ah thought there was only 1 value. then you can do this in an inner join:

SELECT  COUNT(*)  
FROM    items i
    join (SELECT entry FROM research.spell) t 
          on t.entry NOT IN (spellid_1, spellid_2, spellid_3, spellid_4, spellid_5)

Upvotes: 0

csaba
csaba

Reputation: 680

The normalization step suggested would be useful (to have a connection table for the many-to-many item-spell relation). A disadvantage of the denormalized version is that spells of items have an implicit ordering, we always have to deal with all of them, when for example checking if an item have a specific spell or not.

However, the storage engine optimizes the long sql with the 5 identical subquery, it shouldn't cause performance problems. An alternative phrasing would be, using the SQL '99 standard 'with' clause:

WITH   spellids(entry) AS SELECT entry FROM research.spell
SELECT COUNT(*)  
FROM   items
WHERE     spellid_1 NOT IN spellids OR spellid_2 NOT IN spellids
       OR spellid_3 NOT IN spellids OR spellid_4 NOT IN spellids
       OR spellid_5 NOT IN spellids ;

Not much shorter, and unfortunately MySQL does not support the 'with' clause (see this question) yet.

Upvotes: 2

hobodave
hobodave

Reputation: 29301

Spidey, good question. Try the following:

SELECT  COUNT(*)
FROM items i
LEFT JOIN research.spell spell1 ON i.spellid_1 = spell1.entry
LEFT JOIN research.spell spell2 ON i.spellid_2 = spell2.entry
LEFT JOIN research.spell spell3 ON i.spellid_3 = spell3.entry
LEFT JOIN research.spell spell4 ON i.spellid_4 = spell4.entry
LEFT JOIN research.spell spell5 ON i.spellid_5 = spell5.entry
WHERE spell1.entry IS NULL 
OR spell2.entry IS NULL 
OR spell3.entry IS NULL
OR spell4.entry IS NULL
OR spell5.entry IS NULL

The key here is that you want to LEFT JOIN your research.spell table, so that it includes items that don't have a corresponding row for the given JOIN condition. Then you filter that tableset where the right side of the join IS NULL. This gives you rows from the left side table (items) with no corresponding row in the right side table (research.spell).

EDIT:

Also note, that I left your initial SELECT COUNT(*) unchanged. This will give you the total number of items that have 1 or more invalid spells. You'll need to change this to SELECT i.id or something similar to fetch the IDs of items that have invalid spells.

Upvotes: 0

Tony Andrews
Tony Andrews

Reputation: 132720

It would have been more elegant to design the tables so that you didn't have 5 spellid columns in the same table - i.e by having an item_spell table that would allow any number of spells per item. Apart from being more future-proof (when you find you now need 6 spells), your query would become:

SELECT COUNT(DISTINCT item_id)
  FROM item_spells
 WHERE spell_id NOT IN (SELECT entry FROM research.spell);

As it is, you are forced to perform the check 5 times.

Upvotes: 2

Related Questions