Reputation: 151
Have a basic table structure for "domain_keywords"
domain keyword
golflink.com golf courses
golflink.com golf
golflink.com golf courses near me
...
2ndswing.com golf clubs
2ndswing.com used golf clubs
...
So what I am trying to do is a query to take one domain like "golflink.com" and find other domains with matching keywords and get the 10 which have the most matches.
Have tried the following and it works on MySQL 8, but our shared web server (read: cannot upgrade) uses MySQL 5.7 so the "with" clause is not allowed and need an alternate way of doing this query:
WITH t1 AS (
SELECT keyword, domain
FROM domain_keywords
WHERE domain ='golflink.com'
GROUP BY domain, keyword
)
SELECT t1.domain AS 'Domain', t2.domain AS 'SimilarDomain', count(t2.keyword) AS 'SharedKeywordsNumber'
FROM t1, domain_keywords t2
WHERE t1.keyword = t2.keyword AND t1.domain != t2.domain
GROUP BY t1.domain, t2.domain
ORDER BY 3 DESC, 2
LIMIT 10
Upvotes: 1
Views: 47
Reputation: 15951
There is really no need for a CTE or even a subquery for this, also implicit/comma join notation is archaic and explicit joins should be used for all but the most basic throwaway queries. (Also, unless that is a transcription error in your query, that should not run in any version of MySQL; ' is not used for delimiting identifiers; ` is used for that. ' is only used for delimiting strings).
SELECT t1.domain AS `Domain`, t2.domain AS `SimilarDomain`, COUNT(*) AS `SharedKeywordsNumber`
FROM domain_keywords AS t1
INNER JOIN domain_keywords AS t2 ON t1.keyword = t2.keyword AND t1.domain != t2.domain
WHERE t1.domain ='golflink.com'
GROUP BY t1.domain, t2.domain
ORDER BY `SharedKeywordsNumber` DESC, `SimilarDomain`
LIMIT 10
;
Another sidenote: The ORDER BY field_ordinal
format has been deprecated for years, as it makes queries unnecessarily difficult to read or modify.
Upvotes: 1
Reputation: 17190
A first approach will be using a derived table in the FROM
clause similar to the one that you are using inside the WHIT
clause, try this:
SELECT
t1.domain AS 'Domain',
t2.domain AS 'SimilarDomain',
count(t2.keyword) AS 'SharedKeywordsNumber'
FROM
( SELECT
keyword, domain
FROM
domain_keywords
WHERE
domain ='golflink.com'
GROUP BY
domain, keyword ) AS t1
CROSS JOIN
domain_keywords AS t2
WHERE
t1.keyword = t2.keyword AND t1.domain != t2.domain
GROUP BY
t1.domain, t2.domain
ORDER BY
3 DESC, 2 LIMIT 10
As an improvement, I think you could also replace the CROSS JOIN
by one INNER JOIN
, in this way (but not 100% sure of this):
SELECT
t1.domain AS 'Domain',
t2.domain AS 'SimilarDomain',
count(t2.keyword) AS 'SharedKeywordsNumber'
FROM
( SELECT
keyword, domain
FROM
domain_keywords
WHERE
domain ='golflink.com'
GROUP BY
domain, keyword ) AS t1
INNER JOIN
domain_keywords AS t2 ON t1.keyword = t2.keyword AND t1.domain != t2.domain
GROUP BY
t1.domain, t2.domain
ORDER BY
3 DESC, 2 LIMIT 10
Upvotes: 1