stevendesu
stevendesu

Reputation: 16801

Join on equal substring?

I have a database of games and prices, in which I recently found yet another issue. There are some games which appear in the database twice - once for the PC version and once for the Mac version. If there are two such versions of a game, I don't need the mac version in my database. If there is ONLY a mac version of the game, however, I would like to keep it.

The issue was made a bit simpler for me since all games that also have a mac version will use the exact same name plus a small snippet at the end to indicate it's for mac. This might take the form "Left 4 Dead 2 (Mac)", "Left 4 Dead 2 Mac", or "Left 4 Dead 2 [Mac Download]".

To try and find these entries I created the following query:

SELECT *
FROM `gamelist` g1
JOIN `gamelist` g2
ON
    SUBSTR(g1.`title`,0,20) = SUBSTR(g2.`title`,0,20)
    AND
    g1.`title`<>g2.`title`
WHERE
    (g1.`amazon_id` IS NOT NULL AND g2.`amazon_id` IS NOT NULL)
    OR (g1.`steam_id` IS NOT NULL AND g2.`steam_id` IS NOT NULL)
    OR (g1.`impulse_id` IS NOT NULL AND g2.`impulse_id` IS NOT NULL)
    ...

The idea should be fairly straightforward. The issue is that the SUBSTR() didn't seem to work at all. In a database with roughly 8,000 entries it returned about 64 million results. Obviously it's completely skipping the SUBSTR(g1.title,0,20) = SUBSTR(g2.title,0,20) line and joining the rows any time g1.title<>g2.title

How can I join on equal substrings like this?

Upvotes: 1

Views: 6351

Answers (1)

pilcrow
pilcrow

Reputation: 58589

SUBSTR(), in the manner you're using it, is one-indexed, not zero-indexed. You want

SUBSTR(g1.`title`,1,20) = SUBSTR(g2.`title`,1,20)

Upvotes: 3

Related Questions