June Lewis
June Lewis

Reputation: 355

Pulling Trending Data WeeK Over Week

I am trying to pull some sums week over week, and the query I have built gets the data, but is horrificaly slow.

 SELECT app_client_id
 , (SELECT COUNT(b.app_id) FROM `li_appointments` AS b
         WHERE YEARWEEK(b.app_datetime) = YEARWEEK(DATE_ADD(NOW(),INTERVAL -1 WEEK))
         AND b.app_client_id = a.app_client_id)
 , (SELECT COUNT(b.app_id) FROM `li_appointments` AS b
         WHERE YEARWEEK(b.app_datetime) = YEARWEEK(DATE_ADD(NOW(),INTERVAL -2 WEEK))
         AND b.app_client_id = a.app_client_id)
 FROM `li_appointments` AS a

EXPLAIN gives me this:

 |id||select_type       ||table||type ||possible_keys||key  ||key_len||ref||rows ||Extra    |
 |1 ||PRIMARY           ||a    ||index|\N             |Extra||18      |\N  |61901||Using index|
 |3 ||DEPENDENT SUBQUERY||b    ||index|\N             |Extra||18      |\N  |61901||Using where; Using index|
 |2 ||DEPENDENT SUBQUERY||b    ||index|\N             |Extra||18      |\N  |61901||Using where; Using index|

The table being refrenced is :

 create table `li_appointments` (
`app_id` int (11),
`app_datetime` datetime ,
`app_facility` varchar (600),
`app_department` varchar (600),
`app_address` varchar (600),
`app_language` varchar (600),
`app_requesting_person` varchar (600),
`app_service_provider` varchar (600),
`app_client_id` int (11),
`app_client_other` varchar (600),
`app_medicaid_status` int (11),
`app_health_program` varchar (150),
`les_name` varchar (500),
`les_dob` varchar (75),
`les_medicaid_id` varchar (500),
`billing_total_time` time ,
`billing_workorder_received` int (11),
`billing_admin_fee` float ,
`billing_notes` varchar (3000),
`app_notes` varchar (3000),
`created_by` varchar (300),
`created_on` datetime ,
`modified_by` varchar (300),
`modified_on` datetime ,
`wo_entered_by` varchar (300),
`app_callback_num` varchar (135),
`terp_id` varchar (135),
`app_covered_by` varchar (135),
`covered_on` datetime ,
`uofu_csn` varchar (135)); 

Is there a better way to get this data?

Upvotes: 0

Views: 140

Answers (3)

Mike Brant
Mike Brant

Reputation: 71384

It is likely slow because you all your WHERE clauses are using things like YEARWEEK(b.app_datetime) that prevent use indexes and the fact that you are using subselects.

I don't really understand why you are using the subselects though. A simple case statement should eliminate such a need:

SELECT
    app_client_id,
    SUM(
        CASE
            WHEN app_datetime > DATE_ADD(NOW(),INTERVAL -1 WEEK) THEN 1
            ELSE 0
        END
    ) AS last_week_count,
    SUM(
        CASE
            WHEN app_datetime > DATE_ADD(NOW(),INTERVAL -1 WEEK) THEN 0
            WHEN app_datetime > DATE_ADD(NOW(),INTERVAL -2 WEEK) THEN 1
            ELSE 0
        END
    ) AS two_weeks_ago_count    
FROM `li_appointments`
GROUP BY app_client_id

Make sure you have an index on app_client_id.

It should also be noted that this would provide a rolling week over week data set. In your original question, you were using YEARWEEK() which could have actually provided data for more than a week at a time. For example Say it is Wednesday and I want to count all items with YEARWEEK() equal to last Wednesday. This would actually match all items from SUN/MON through SAT/SUN of that previous week (week start and end depending on mode your MySQL is running in with regards to week numbering).

If this is in fact what you want, this is even easier to achieve:

SELECT
    app_client_id,
    YEARWEEK(app_datetime) AS year_week,
    SUM(1) as weekly_count   
FROM `li_appointments`
GROUP BY `app_client_id`, `year_week`
HAVING `year_week` < YEARWEEK(NOW())
ORDER BY `year_week` DESC
LIMIT 2

This would give you data for last two full weeks, but not current week. If you want current week (which might be up to 6 days, 23:59:59 long) just remove the HAVING clause.

Upvotes: 1

PlantTheIdea
PlantTheIdea

Reputation: 16359

You might want to try a subquery with UNION ALL setup:

SELECT app_client_id
    ,SUM(app_id_count_1_week) AS app_id_count_1_week
    ,SUM(app_id_count_2_weeks) AS app_id_count_2_weeks
FROM (
    SELECT app_client_id
        ,COUNT(app_id) AS app_id_count_1_week
        ,CAST(0 AS INTEGER) AS app_id_count_2_weeks
    FROM li_appointments
    WHERE b.app_datetime >= CURRENT_DATE - INTERVAL '7' DAY
    GROUP BY app_client_id
    UNION ALL
    SELECT app_client_id
        ,CAST(0 AS INTEGER) AS app_id_count_1_week
        ,COUNT(app_id) AS app_id_count_2_weeks
    FROM li_appointments
    WHERE b.app_datetime >= CURRENT_DATE - INTERVAL '14' DAY
    GROUP BY app_client_id
) AS Both_Counts
GROUP BY app_client_id

This will do the aggregations as needed separately, UNION the results, and then the SUM on the top layer will add the actual COUNT and 0. The change in the syntax is making the WHERE clause ANSI-compliant.

Just an idea off the top of my head.

Upvotes: 0

Rahul
Rahul

Reputation: 77866

Change the subquery to join like below and see if it improves the performance

SELECT 
a.app_client_id,
COUNT(b.app_id) as appid_count
from app_client_id a join
li_appointments b on b.app_client_id = a.app_client_id
where 
YEARWEEK(b.app_datetime) in 
   (
   (YEARWEEK(DATE_ADD(NOW(),INTERVAL -1 WEEK)),
   (YEARWEEK(DATE_ADD(NOW(),INTERVAL -2 WEEK))
   )

Upvotes: 0

Related Questions