Boopathi D
Boopathi D

Reputation: 377

optimization for select of select in mysql query

My mysql query is taking nearly 7 seconds to fetch data, where I used select of select.

SELECT s.id
     , s.user_id
     , COUNT(s.od_status) od_status
     , (SELECT count(t2.od_status) 
          FROM subscription t2 
         WHERE od_status <> 3 
           AND t2.od_status <> 7 
           AND t2.od_status <> 8 
           AND t2.user_id = s.user_id 
           AND DATE(IF(t2.rescheduling_delivery_date IS NULL, t2.dated, t2.rescheduling_delivery_date)) BETWEEN DATE_FORMAT('2021-07-26', '%Y-%m-01') AND '2021-07-26'
        ) od_status_count 
  FROM subscription 
  LEFT 
  JOIN users u
    ON u.id = s.user_id 
 WHERE DATE(IF(s.rescheduling_delivery_date IS NULL, s.dated, s.rescheduling_delivery_date)) BETWEEN DATE_FORMAT('2021-07-26' , '%Y-%m-01') AND '2021-07-26' 
 GROUP 
    BY s.user_id;

I need to optimize the above query, the thing I am getting 7 secs is by adding below select of select in above query

(select count(t2.od_status) from subscription t2 where od_status<>3 and t2.od_status<>7 and t2.od_status<>8 
and t2.user_id=subscription.user_id and 
date(IF(t2.rescheduling_delivery_date IS NULL,t2.dated,t2.rescheduling_delivery_date)) 
between DATE_FORMAT('2021-07-26' ,'%Y-%m-01') AND '2021-07-26') as od_status_count

any suggestion to optimize or another way to use that.

NOTE: I have given index to the user_id in subscription table

Upvotes: 0

Views: 92

Answers (2)

Boopathi D
Boopathi D

Reputation: 377

As @RahulBiswas suggested, I changed the subquery to LEFT JOIN and the query executed in 0.16 secs

select 
`subscription`.`id`, 
`subscription`.`user_id`, 
COUNT(subscription.od_status) as od_status ,
x.od_status_count
from `subscription` left join `users` on `users`.`id` = `subscription`.`user_id`
left join(
  select t2.user_id,count(t2.od_status)as od_status_count from subscription t2 where od_status<>3 and t2.od_status<>7 and t2.od_status<>8 
and 
date(IF(t2.rescheduling_delivery_date IS NULL,t2.dated,t2.rescheduling_delivery_date)) 
between DATE_FORMAT('2021-07-26' ,'%Y-%m-01') AND '2021-07-26'
group by t2.user_id)x
on x.user_id=subscription.user_id    
where date(IF(subscription.rescheduling_delivery_date IS NULL,subscription.dated,subscription.rescheduling_delivery_date)) 
between DATE_FORMAT('2021-07-26' ,'%Y-%m-01') AND '2021-07-26' 
group by `subscription`.`user_id`

I changed the subquery to joins as below,

left join(
select t2.user_id,count(t2.od_status)as od_status_count from subscription t2 where od_status<>3 and t2.od_status<>7 and t2.od_status<>8 
and 
date(IF(t2.rescheduling_delivery_date IS NULL,t2.dated,t2.rescheduling_delivery_date)) 
between DATE_FORMAT('2021-07-26' ,'%Y-%m-01') AND '2021-07-26'
group by t2.user_id)x
on x.user_id=subscription.user_id

Upvotes: 1

The Impaler
The Impaler

Reputation: 48770

For the optimizer to be able to use indexes you'll need to:

  • Rephrase your query. Remove the IF() function by using OR or UNION/UNION ALL.
  • Remove date() function and replace it by timestamp comparison instead.

If you don't do that you can still get a somewhat better performance by adding an index for the subquery. This is the simplest option, but won't really get great improvements. You can add the following index:

create index ix1 on subscription (user_id, od_status);

For further improvement in performance you can use a covering index instead:

create index ix1 on subscription (
   user_id,
   od_status,
   rescheduling_delivery_date, 
   dated, 
   rescheduling_delivery_date
);

Upvotes: 0

Related Questions