Reputation: 7805
So I have one query:
SELECT EWRporta2_articles.*, xf_thread.*, xf_forum.*, xf_user.*, xf_post.message,
IF(NOT ISNULL(xf_user.user_id), xf_user.username, xf_thread.username) AS username
FROM EWRporta2_articles
INNER JOIN xf_thread ON (xf_thread.thread_id = EWRporta2_articles.thread_id)
INNER JOIN xf_forum ON (xf_forum.node_id = xf_thread.node_id)
INNER JOIN xf_post ON (xf_post.post_id = xf_thread.first_post_id)
LEFT JOIN xf_user ON (xf_user.user_id = xf_thread.user_id)
WHERE EWRporta2_articles.article_date < 1417987751
AND xf_thread.discussion_state = 'visible'
ORDER BY EWRporta2_articles.article_date DESC
LIMIT 0, 5
This query executes in 0.0012 seconds... thats good. What this query does is query for a list of articles, then links them to a thread on my forum.
However, I am trying to slightly alter the query. While the query above requires an article row to exist for the thread. I would like to find threads that link to a specific article OR exist in a specific forum node id. So even if an article row doesn't exist for a thread, if it has a specific node_id, it will still show up. This is the query I have for that:
SELECT EWRporta2_articles.*, xf_thread.*, xf_forum.*, xf_user.*, xf_post.message,
IF(EWRporta2_articles.article_date IS NULL, xf_thread.post_date, EWRporta2_articles.article_date) AS article_date,
IF(NOT ISNULL(xf_user.user_id), xf_user.username, xf_thread.username) AS username
FROM xf_thread
LEFT JOIN EWRporta2_articles ON (EWRporta2_articles.thread_id = xf_thread.thread_id)
INNER JOIN xf_forum ON (xf_forum.node_id = xf_thread.node_id)
INNER JOIN xf_post ON (xf_post.post_id = xf_thread.first_post_id)
LEFT JOIN xf_user ON (xf_user.user_id = xf_thread.user_id)
WHERE ( xf_thread.node_id IN ('66','78') OR EWRporta2_articles.article_date IS NOT NULL )
AND IF(EWRporta2_articles.article_date IS NULL, xf_thread.post_date, EWRporta2_articles.article_date) < 1417987751
AND xf_thread.discussion_state = 'visible'
ORDER BY article_date DESC
LIMIT 0, 5
The problem with this query is it executes in 0.5683 seconds.
Is there anything I can do to improve performance here?
Upvotes: 0
Views: 74
Reputation: 36107
First of all - never ever don't do that:
AND IF(EWRporta2_articles.article_date IS NULL, xf_thread.post_date
, EWRporta2_articles.article_date) < 1417987751
Using a functon prevents any database to use indexes in order to optimize an expression like that.
Rewrite it into this form:
EWRporta2_articles.article_date IS NULL AND xf_thread.post_date < 1417987751
OR
EWRporta2_articles.article_date IS NOT NULL AND EWRporta2_articles.article_date < 1417987751
And now use laws of boolean algebra to simplify the condition of the query,
especially using the law: Distirbutivity of AND over OR
: http://en.wikipedia.org/wiki/Boolean_algebra
X AND ( Y OR Z ) => X AND Y OR X AND Z
If you apply this law to this condition:
WHERE
( xf_thread.node_id IN ('66','78') OR EWRporta2_articles.article_date IS NOT NULL )
AND
(
EWRporta2_articles.article_date IS NULL AND xf_thread.post_date < 1417987751
OR
EWRporta2_articles.article_date IS NOT NULL AND EWRporta2_articles.article_date < 1417987751
)
you will get:
xf_thread.node_id IN ('66','78') AND EWRporta2_articles.article_date IS NULL AND xf_thread.post_date < 1417987751
OR
xf_thread.node_id IN ('66','78') AND EWRporta2_articles.article_date IS NOT NULL AND EWRporta2_articles.article_date < 1417987751
OR
EWRporta2_articles.article_date IS NOT NULL AND EWRporta2_articles.article_date IS NULL AND xf_thread.post_date < 1417987751
OR
EWRporta2_articles.article_date IS NOT NULL AND EWRporta2_articles.article_date IS NOT NULL AND EWRporta2_articles.article_date < 1417987751
The third condition is always false, so we can skip it:
EWRporta2_articles.article_date IS NOT NULL AND EWRporta2_articles.article_date IS NULL ...
The second one and the fourth one:
xf_thread.node_id IN ('66','78') AND EWRporta2_articles.article_date IS NOT NULL AND EWRporta2_articles.article_date < 1417987751
OR
EWRporta2_articles.article_date IS NOT NULL AND EWRporta2_articles.article_date IS NOT NULL AND EWRporta2_articles.article_date < 1417987751
can be simplified into:
EWRporta2_articles.article_date IS NOT NULL AND EWRporta2_articles.article_date < 1417987751
and it can be further simplified to just:
EWRporta2_articles.article_date < 1417987751
In the end we will get:
WHERE (
xf_thread.node_id IN ('66','78')
AND
EWRporta2_articles.article_date IS NULL
AND
xf_thread.post_date < 1417987751
OR
EWRporta2_articles.article_date < 1417987751
)
AND xf_thread.discussion_state = 'visible'
And now divide the query into two separate subqueries, then union their results in this way:
SELECT *
FROM (
SELECT ......
JOIN ... JOIN ... JOIN ...
WHERE
xf_thread.node_id IN ('66','78')
AND EWRporta2_articles.article_date IS NULL
AND xf_thread.post_date < 1417987751
AND xf_thread.discussion_state = 'visible'
-- ORDER BY article_date DESC
LIMIT 0, 5
) q1
UNION
SELECT *
FROM (
SELECT ......
JOIN ... JOIN ... JOIN ...
WHERE
EWRporta2_articles.article_date < 1417987751
AND xf_thread.discussion_state = 'visible'
ORDER BY article_date DESC
LIMIT 0, 5
) q2
ORDER BY article_date DESC
LIMIT 0, 5
Notice that in the first subquery the clause ORDER BY article_date DESC
is skipped - since the condition says: AND EWRporta2_articles.article_date IS NULL
, then the date is always NULL, and we can skip redundant sort operation, because it's a waste of time.
Upvotes: 1