andig
andig

Reputation: 13868

MySQL left joining subquery to group by

I have a requirement where I need o group data into equal number ob rows. As mysql doesn't have rownum() I'm simulating this behaviour:

SET @row:=6; 
SELECT MAX(agg.timestamp) AS timestamp, MAX(agg.value) AS value, COUNT(agg.value) AS count 
FROM 
    (
        SELECT timestamp, value, @row:=@row+1 AS row 
        FROM data 
        WHERE channel_id=52 AND timestamp >= 0 ORDER BY timestamp
    ) AS agg 
GROUP BY row div 8
ORDER BY timestamp ASC;

Note: according to Can grouped expressions be used with variable assignments? this query may not be 100% correct, but it does work.

An additional requirement is to calculate the row difference between the grouped sets. I've looked for a solution joining the same table with a subquery:

SET @row:=6; 
SELECT MAX(agg.timestamp) AS timestamp, MAX(agg.value) AS value, COUNT(agg.value) AS count 
FROM 
    (
        SELECT timestamp, value, @row:=@row+1 AS row 
        FROM data 
        WHERE channel_id=52 AND timestamp >= 0 ORDER BY timestamp
    ) AS agg 
LEFT JOIN data AS prev
    ON prev.channel_id = agg.channel_id
    AND prev.timestamp = (
        SELECT MAX(timestamp) 
        FROM data
        WHERE data.channel_id = agg.channel_id
            AND data.timestamp < MIN(agg.timestamp)
    )
GROUP BY row div 8
ORDER BY timestamp ASC;

Unfortunately that errors:

Error Code: 1054. Unknown column 'agg.channel_id' in 'on clause'

Any idea how this query could be written?

Upvotes: 1

Views: 318

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269643

How about this version:

SELECT MAX(agg.timestamp) AS timestamp, MAX(agg.value) AS value, COUNT(agg.value) AS count, COALESCE(prev.timestamp, 0) AS prev_timestamp
FROM (SELECT d.*, @row:=@row+1 AS row 
      FROM data d CROSS JOIN
           (select @row := 6) vars
      WHERE channel_id = 52 AND timestamp >= 0 ORDER BY timestamp
     )  agg LEFT JOIN
    data prev
    ON prev.channel_id = agg.channel_id AND
       prev.timestamp = (SELECT MAX(timestamp) 
                         FROM data
                         WHERE data.channel_id = agg.channel_id AND
                               data.timestamp < agg.timestamp
                        )
GROUP BY row div 8
ORDER BY timestamp ASC;

This includes all the columns in the subquery. And it puts the variable initialization in the same query.

Upvotes: 0

Marc B
Marc B

Reputation: 360612

You never selected channel_id from your sbuquery, so it's not returned to the parent query, and is therefore invisible. Try

SELECT MAX(agg.timestamp) AS timestamp, MAX(agg.value) AS value, COUNT(agg.value) AS count 
FROM 
    (
        SELECT timestamp, value, @row:=@row+1 AS row, channel_id
                                                    ^^^^^^^^^^^^-- need this
        FROM data 

Since MySQL only sees and uses the fields you explicitly return from that subquery, and will NOT "dig deeper" into the table underlying the query, you need to select/return all of the fields you'll be using the parent queries.

Upvotes: 2

Related Questions