Petah
Petah

Reputation: 46060

calculate difference of two calculated fields

I have this query, which basically gets the average spend of a customer over the last year, and 3 months:

SELECT SQL_CALC_FOUND_ROWS 
       customer_id,
       customer_name,
       AVG(IF(
            DATE(CONCAT(year_of_spend, "-", month_of_spend, "-01")) >=
                DATE_FORMAT(NOW() - INTERVAL 1 YEAR, "%Y-%m-01"),
            spend_amount,
            NULL
       )) AS 1_year_average_spend,
       AVG(IF(
            DATE(CONCAT(year_of_spend, "-", month_of_spend, "-01")) >=
                DATE_FORMAT(NOW() - INTERVAL 3 MONTH, "%Y-%m-01"),
            spend_amount,
            NULL
       )) AS 3_month_average_spend

FROM   customer_spends
GROUP  BY customer__id

But I also need to get the percent difference of the spend averages:

E.g. (pseudo code)

if (1_year_average_spend = 0)
    change = N/A
else 
    change = 3_month_average_spend / 1_year_average_spend - 1

How can, or what do you recommend I do to implement this?

The only way I can think of is horrible:

IF(
    AVG(IF(
            DATE(CONCAT(year_of_spend, "-", month_of_spend, "-01")) >=
                DATE_FORMAT(NOW() - INTERVAL 1 YEAR, "%Y-%m-01"),
            `spend_amount`,
            NULL
    )) > 0,
    AVG(IF(
            DATE(CONCAT(year_of_spend, "-", month_of_spend, "-01")) >=
                DATE_FORMAT(NOW() - INTERVAL 3 MONTH, "%Y-%m-01"),
            spend_amount,
            NULL
    )) / AVG(IF(
            DATE(CONCAT(year_of_spend, "-", month_of_spend, "-01")) >=
                DATE_FORMAT(NOW() - INTERVAL 1 YEAR, "%Y-%m-01"),
            `spend_amount`,
            NULL
    )) - 1,
    "N/A"
) AS 3_month_performance

Upvotes: 4

Views: 1709

Answers (4)

ypercubeᵀᴹ
ypercubeᵀᴹ

Reputation: 115620

Getting rid of IF(), DATE() and CONCAT() function calls. Your query, as it is now, has to scan the whole customer_spends table and check those complex conditions for ALL rows, even if they are 10 years old data.

This will also use an index of (year_of_spend, month_of_spend) or (customer_id, year_of_spend, month_of_spend) to speed up the query:

SELECT c.customer_id
     , c.customer_name
     , 1_year_average_spend
     , 3_month_average_spend
     , CASE WHEN 1_year_average_spend = 0
              THEN 'N/A'
              ELSE (3_month_average_spend / 1_year_average_spend) - 1
       END AS percent_difference
FROM
    customer AS c
  JOIN
    ( SELECT customer_id
           , AVG(spend_amount) AS 1_year_average_spend 
      FROM customer_spends 
      WHERE (year_of_spend, month_of_spend) >=
               ( YEAR(CUR_DATE() - INTERVAL 1 YEAR)
               , MONTH(CUR_DATE() - INTERVAL 1 YEAR)
               )
      GROUP BY customer_id
    ) AS grp1year
    ON grp1year.customer_id = c.customer_id
  LEFT JOIN
    ( SELECT customer_id
           , AVG(spend_amount) AS 3_month_average_spend
      FROM customer_spends 
      WHERE (year_of_spend, month_of_spend) >=
               ( YEAR(CUR_DATE() - INTERVAL 3 MONTH)
               , MONTH(CUR_DATE() - INTERVAL 3 MONTH)
               )
      GROUP BY customer_id
    ) AS grp3month
    ON grp3month.customer_id = c.customer_id    

Upvotes: 0

Bohemian
Bohemian

Reputation: 425208

Use an inner select (it's like a temporary view) and select from that. This should work:

SELECT
  customer_id,
  customer_name,
  1_year_average_spend,
  3_month_average_spend,
  if (1_year_average_spend = 0, "N/A", (3_month_average_spend / 1_year_average_spend) - 1) AS 3_month_performance
FROM (SELECT
  customer_id,
  customer_name,
  AVG(IF(DATE(CONCAT(year_of_spend, "-", month_of_spend, "-01")) >=
    DATE_FORMAT(NOW() - INTERVAL 1 YEAR, "%Y-%m-01"), spend_amount, NULL)) AS 1_year_average_spend,
  AVG(IF(DATE(CONCAT(year_of_spend, "-", month_of_spend, "-01")) >=
    DATE_FORMAT(NOW() - INTERVAL 3 MONTH, "%Y-%m-01"), spend_amount, NULL)) AS 3_month_average_spend)
FROM customer_spends 
GROUP BY customer_id, customer_name ) x

Upvotes: 1

Mike
Mike

Reputation: 21659

If you are happy to use MySQL specific code, you can use User-Defined Variables like this (simplified version):

SELECT @avg1 := ROUND((1 + 2 + 3) / 3, 2) AS avg1,
       @avg2 := ROUND((4 + 5 + 6) / 3, 2) AS avg2,
   IF( @avg1, ROUND(@avg2 / @avg1 - 1, 2), NULL ) AS result;

+------+------+--------+
| avg1 | avg2 | result |
+------+------+--------+
| 2.00 | 5.00 |   1.50 |
+------+------+--------+

That would become:

SELECT SQL_CALC_FOUND_ROWS 
       customer_id,
       customer_name,
       @1_year_average_spend := AVG(IF(
            DATE(CONCAT(year_of_spend, "-", month_of_spend, "-01")) >=
                DATE_FORMAT(NOW() - INTERVAL 1 YEAR, "%Y-%m-01"),
            spend_amount,
            NULL
       )) AS 1_year_average_spend,
       @3_month_average_spend := AVG(IF(
            DATE(CONCAT(year_of_spend, "-", month_of_spend, "-01")) >=
                DATE_FORMAT(NOW() - INTERVAL 3 MONTH, "%Y-%m-01"),
            spend_amount,
            NULL
       )) AS 3_month_average_spend,
       IF( @1_year_average_spend,
           @3_month_average_spend / @1_year_average_spend - 1,
           NULL
       ) AS diff

FROM   customer_spends
GROUP  BY customer__id

Note 1: I've used diff as the column name for the difference, as change is a reserved word, so may end up causing problems.

Note 2: You need to be aware of the following caveats from the docs, as they may affect your result:

Assignment of decimal and real values does not preserve the precision or scale of the value.

And:

As a general rule, you should never assign a value to a user variable and read the value within the same statement. You might get the results you expect, but this is not guaranteed. The order of evaluation for expressions involving user variables is undefined and may change based on the elements contained within a given statement. In SELECT @a, @a:=@a+1, ..., you might think that MySQL will evaluate @a first and then do an assignment second. However, changing the statement (for example, by adding a GROUP BY, HAVING, or ORDER BY clause) may cause MySQL to select an execution plan with a different order of evaluation.

So use with caution, and proper testing!

Upvotes: 1

Andriy M
Andriy M

Reputation: 77707

Would NULL do as the N/A value? If so, you could apply NULLIF() to the denominator like this:

a / NULLIF(b, 0) - 1

NULLIF returns NULL if its first argument is equal to the second argument. And if an operand is NULL, the entire expression evaluates to NULL.

Like @Bohemian, I, too, suggest using a subselect. Here's the complete query:

SELECT SQL_CALC_FOUND_ROWS
  customer_id,
  customer_name,
  1_year_average_spend,
  3_month_average_spend,
  3_month_average_spend / NULLIF(1_year_average_spend, 0) - 1 AS change
FROM (
  SELECT
    customer_id,
    customer_name,
    AVG(IF(
         DATE(CONCAT(year_of_spend, "-", month_of_spend, "-01")) >=
             DATE_FORMAT(NOW() - INTERVAL 1 YEAR, "%Y-%m-01"),
         spend_amount,
         NULL
    )) AS 1_year_average_spend,
    AVG(IF(
         DATE(CONCAT(year_of_spend, "-", month_of_spend, "-01")) >=
             DATE_FORMAT(NOW() - INTERVAL 3 MONTH, "%Y-%m-01"),
         spend_amount,
         NULL
    )) AS 3_month_average_spend
  FROM customer_spends
  GROUP BY customer__id
) s

Upvotes: 1

Related Questions