jdk
jdk

Reputation: 143

Need some help optimising an SQL query

my client was given the following code and he uses it daily to count the messages sent to businesses on his website. I have looked at the MYSQL.SLOW.LOG and it has the following stats for this query, which indicates to me it needs optimising.

Count: 183 Time=44.12s (8073s) Lock=0.00s (0s) Rows_sent=17337923391683297280.0 (-1), Rows_examined=382885.7 (70068089), Rows_affected=0.0 (0), thewedd1[thewedd1]@localhost

The query is:

SELECT
  businesses.name AS BusinessName,
  messages.created AS DateSent,
  messages.guest_sender AS EnquirersEmail,
  strip_tags(messages.message) AS Message,
  users.name AS BusinessName
FROM
  messages
  JOIN users ON messages.from_to = users.id
  JOIN businesses ON users.business_id = businesses.id

My SQL is not very good but would a LEFT JOIN rather than a JOIN help to reduce the number or rows returned? Ive have run an EXPLAIN query and it seems to make no difference between the LEFT JOIN and the JOIN..

Basically I think it would be good to reduce the number of rows returned, as it is absurdly big..

Upvotes: 0

Views: 48

Answers (2)

Jon Armstrong
Jon Armstrong

Reputation: 4694

he uses it daily to count the messages sent to businesses

If this is done per day, why not limit this to messages sent in specific recent days?

As an example: To count messages sent per business per day, for just a few recent days (example: 3 or 4 days), try this:

SELECT businesses.name       AS BusinessName
     , messages.created      AS DateSent
     , COUNT(*) AS n
  FROM messages
  JOIN users      ON messages.from_to  = users.id
  JOIN businesses ON users.business_id = businesses.id
 WHERE messages.created BETWEEN current_date - INTERVAL '3' DAY AND current_date
 GROUP BY businesses.id
        , DateSent
 ORDER BY DateSent DESC
        , n DESC
        , businesses.id
;

Note: businesses.name is functionally dependent on businesses.id (in the GROUP BY terms), which is the primary key of businesses.

Example result:

+--------------+------------+---+
| BusinessName | DateSent   | n |
+--------------+------------+---+
| business1    | 2021-09-05 | 3 |
| business2    | 2021-09-05 | 1 |
| business2    | 2021-09-04 | 1 |
| business2    | 2021-09-03 | 1 |
| business3    | 2021-09-02 | 5 |
| business1    | 2021-09-02 | 1 |
| business2    | 2021-09-02 | 1 |
+--------------+------------+---+
7 rows in set

This assumes your basic join logic is correct, which might not be true.

Other data could be returned as aggregated results, if necessary, and the fact that this is now limited to just recent data, the amount of rows examined should be much more reasonable.

Upvotes: 1

DaveWuzHere
DaveWuzHere

Reputation: 255

Short answer: There is nothing "wrong" with your query, other than the duplicate BusinessName alias.

Long answer: You can add indexes to the foreign / primary keys to speed up searching which will do more than changing the query.

If you're using SSMS (SQL management studio) you can right click on indexes for a table and use the wizard.

Just don't be tempted to index all the columns as that may slow down any inserts you do in future, stick to the ids and _ids unless you know what you're doing.

Upvotes: 1

Related Questions