Reputation: 523
Is it possible to achieve this task with a single MySQL query?
Table urls. Fields {id, url}
1, www.mysite.kom
2, mysite.kom
3, anothersite.kom
Table logs. Fields {id, url_id, group_type - a number in the range of 1..10}
1, 1, 4
2, 1, 4
3, 2, 5
4, 2, 5
5, 3, 9
The result of the query in this example should be: 1 (mysite.com and www.mysite.com = 1)
THE GOAL:
Need to count all distinct urls recorded in logs table, but with a few conditions:
1) Urls with and without www. prefix, like mysite.kom and www.mysite.kom,
should be counted as 1 (not 2).
2) Have group_type in the range of 4..6
3) Now, any of these urls with group_type 4..6, which appear in the list of those with group_type lower than 4 - should be ignored and not counted at all.
The SQL code:
SELECT COUNT(DISTINCT TRIM(LEADING 'www.' FROM b.url))
FROM logs a
INNER JOIN urls b
ON a.url_id = b.id
WHERE (group_type BETWEEN 4 AND 6)
----- and this condition below -----
AND TRIM(LEADING 'www.' FROM b.url)
NOT IN (
SELECT TRIM(LEADING 'www.' FROM b.url)
FROM logs a
INNER JOIN urls b
ON a.url_id = b.id
WHERE (group_type < 4)
)
If my sql query is correct, can it be optimized (to look more compact)?
Upvotes: 1
Views: 347
Reputation: 13544
SELECT COUNT(DISTINCT u.id) AS COUNT_QUES FROM urls u
INNER JOIN logs l
ON u.id=l.url_id
WHERE u.url NOT IN (SELECT A.url FROM
(SELECT * FROM urls u
WHERE SUBSTR(u.url,1,3)!='www')A,
(SELECT * FROM urls v
WHERE SUBSTR(v.url,1,3)='www')B
WHERE A.url=SUBSTR(B.url,5,LENGTH(B.url))
)
AND l.group_type BETWEEN 4 AND 6
AND u.id NOT IN
(SELECT DISTINCT u.id FROM urls u
INNER JOIN logs l
ON u.id=l.url_id
WHERE u.url NOT IN (SELECT A.url FROM
(SELECT * FROM urls u
WHERE SUBSTR(u.url,1,3)!='www')A,
(SELECT * FROM urls v
WHERE SUBSTR(v.url,1,3)='www')B
WHERE A.url=SUBSTR(B.url,5,LENGTH(B.url))
)
AND l.group_type < 4
)
OR
SELECT COUNT(DISTINCT CASE WHEN B.URL_ID IS NOT NULL AND FLAG1 = 1 AND FLAG2 = 0 THEN TRIM(LEADING 'www.' FROM A.URL) END)
FROM URLS A
LEFT JOIN (SELECT URL_ID,
MAX(CASE WHEN GROUP_TYPE BETWEEN 4 AND 6 THEN 1 ELSE 0 END) FLAG1,
MAX(CASE WHEN GROUP_TYPE < 4 THEN 1 ELSE 0 END) FLAG2
FROM LOGS
GROUP BY URL_ID) B
ON A.ID = B.URL_ID
Hope this works for you.Check this on SQLFIDDLE - http://sqlfiddle.com/#!2/1fde2/39
Upvotes: 3
Reputation: 183584
Here's one way:
SELECT trimmed_url
FROM ( SELECT TRIM(LEADING 'www.' FROM urls.url) AS trimmed_url,
MIN(logs.group_type) AS min_group_type
FROM logs
JOIN urls
ON urls.id = logs.url_id
GROUP
BY trimmed_url
) t
WHERE min_group_type BETWEEN 4 AND 6
;
But only you can judge whether it looks more compact to you, and only testing can determine whether it performs better.
Upvotes: 1