Reputation: 16801
I have a database of games and prices, in which I recently found yet another issue. There are some games which appear in the database twice - once for the PC version and once for the Mac version. If there are two such versions of a game, I don't need the mac version in my database. If there is ONLY a mac version of the game, however, I would like to keep it.
The issue was made a bit simpler for me since all games that also have a mac version will use the exact same name plus a small snippet at the end to indicate it's for mac. This might take the form "Left 4 Dead 2 (Mac)", "Left 4 Dead 2 Mac", or "Left 4 Dead 2 [Mac Download]".
To try and find these entries I created the following query:
SELECT *
FROM `gamelist` g1
JOIN `gamelist` g2
ON
SUBSTR(g1.`title`,0,20) = SUBSTR(g2.`title`,0,20)
AND
g1.`title`<>g2.`title`
WHERE
(g1.`amazon_id` IS NOT NULL AND g2.`amazon_id` IS NOT NULL)
OR (g1.`steam_id` IS NOT NULL AND g2.`steam_id` IS NOT NULL)
OR (g1.`impulse_id` IS NOT NULL AND g2.`impulse_id` IS NOT NULL)
...
The idea should be fairly straightforward. The issue is that the SUBSTR()
didn't seem to work at all. In a database with roughly 8,000 entries it returned about 64 million results. Obviously it's completely skipping the SUBSTR(g1.title,0,20) = SUBSTR(g2.title,0,20)
line and joining the rows any time g1.title<>g2.title
How can I join on equal substrings like this?
Upvotes: 1
Views: 6351
Reputation: 58589
SUBSTR(), in the manner you're using it, is one-indexed, not zero-indexed. You want
SUBSTR(g1.`title`,1,20) = SUBSTR(g2.`title`,1,20)
Upvotes: 3