Reputation: 65

Possible to improve the performance of this SQL query?

I have a table that has over 100,000,000 rows and I have a query that looks like this:

SELECT
    COUNT(IF(created_at >= '2015-07-01 00:00:00', 1, null)) AS 'monthly',
    COUNT(IF(created_at >= '2015-07-26 00:00:00', 1, null)) AS 'weekly',
    COUNT(IF(created_at >= '2015-06-30 07:57:56', 1, null)) AS '30day',
    COUNT(IF(created_at >= '2015-07-29 17:03:44', 1, null)) AS 'recent'
FROM
    items
WHERE
    user_id = 123456;

The table looks like so:

CREATE TABLE `items` (
   `user_id` int(11) NOT NULL,
   `item_id` int(11) NOT NULL,
   `created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
    PRIMARY KEY (`user_id`,`item_id`),
    KEY `user_id` (`user_id`,`created_at`),
    KEY `created_at` (`created_at`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

The explain looks fairly harmless, minus the massive row counts:

1   SIMPLE  items   ref PRIMARY,user_id user_id 4   const   559864  Using index

I use the query to gather counts for a specific user for 4 segments of time. Is there a smarter/faster way to obtain the same data or is my only option to tally these as new rows are put into this table?

Upvotes: 2

Answers (3)

Rick James

Reputation: 142296

INDEX(user_id, created_at) -- optimal
AND created_at >= '2015-06-30 07:57:56' -- helps because it cuts down on the number of index entries to touch
Doing a UNION does not help since it leads to 4 times as much work.
Doing for subquery SELECTs does not help for the same reason.

Also

COUNT(IF(created_at >= '2015-07-29 17:03:44', 1, null))

can be shortened to

SUM(created_at >= '2015-07-29 17:03:44')

(But probably does not speed it up much)

If the data does not change over time, only new rows are added, then Summary tables of past data would lead to a significant speedup, but only if you could avoid things like '07:57:56' for '30day'. (Why have '00:00:00' for only some of them?) Perhaps the speedup would be another factor of 10 on top of the other changes. Want to discuss further?

(I do not see any advantage in using PARTITION.)

Upvotes: 0

fthiella

Reputation: 49049

I would add an index on created_at field:

ALTER TABLE items ADD INDEX idx_created_at (created_at)

or (as Thomas suggested) since you are also filtering for user_id a composite index on created_at and user_id:

ALTER TABLE items ADD INDEX idx_user_created_at (user_id, created_at)

and then I would write your query as:

SELECT 'monthly' as description, COUNT(*) AS cnt FROM items
WHERE created_at >= '2015-07-01 00:00:00' AND user_id = 123456

UNION ALL

SELECT 'weekly' as description, COUNT(*) AS cnt FROM items
WHERE created_at >= '2015-07-26 00:00:00' AND user_id = 123456

UNION ALL

SELECT '30day' as description, COUNT(*) AS cnt FROM items
WHERE created_at >= '2015-06-30 07:57:56' AND user_id = 123456

UNION ALL

SELECT 'recent' as description, COUNT(*) AS cnt FROM items
WHERE created_at >= '2015-07-29 17:03:44' AND user_id = 123456

yes, the output is a little different. Or you can use inline queries:

SELECT
  (SELECT COUNT(*) FROM items WHERE created_at>=... AND user_id=...) AS 'monthly',
  (SELECT COUNT(*) FROM items WHERE created_at>=... AND user_id=...) AS 'weekly',
  ...

and if you want an average, you could use a subquery:

SELECT
  monthly,
  weekly,
  monthly / total,
  weekly / total
FROM (
  SELECT
    (SELECT COUNT(*) FROM items WHERE created_at>=... AND user_id=...) AS 'monthly',
    (SELECT COUNT(*) FROM items WHERE created_at>=... AND user_id=...) AS 'weekly',
    ...,
    (SELECT COUNT(*) FROM items WHERE user_id=...) AS total
) s

Upvotes: 1

Dumitrescu Bogdan

Reputation: 7267

If you have an index on created_at, I would also put in the where clause created_at >= '2015-06-30 07:57:56' which is the lowest date possible in your segment.

Also with the same index it might work splitting in 4 queries:

select count(*) AS '30day'
FROM
items
WHERE
    user_id = 123456
and created_at >= '2015-06-30 07:57:56'
union ....

And so on

Upvotes: 2

Possible to improve the performance of this SQL query?

Answers (3)

Related Questions