RononDex
RononDex

Reputation: 4173

Merge 2 Rows selected by union query

Okay, I am having a really strange issue.

I am trying to load some sensor data from a mysql database. The table structure looks like this:

+-----------------+---------------------+--------------+
+ PK [bigint(20)] +  timed [bigint(20)] + NO2 [double] +
+-----------------+---------------------+--------------+

Now comes the hard part. There are two identical tables that have the exact same schema as described above. One for raw data and one for validated data.

Until the data gets validated it usually takes around 1 month. So when I display live data from the sensor I am merging those two tables using UNION. The validated value should always have priority over the raw value. Also I only load daily average values.

This is the SQL query I created so far:

SELECT  
    mergedData.rawValue,
    mergedData.validatedValue,
    CAST(IF(mergedData.validatedValue IS NULL, mergedData.rawValue, mergedData.validatedValue) AS DECIMAL(65, 2)) as sensorValue,
    timeValue,
    IsValid
FROM
(SELECT 
    Month(FROM_UNIXTIME(timed / 1000)) as months,
    Year(FROM_UNIXTIME(timed / 1000)) as years,
    DAYOFMONTH(FROM_UNIXTIME(timed / 1000)) as days,
    HOUR(FROM_UNIXTIME(timed / 1000)) as hours,
    MINUTE(FROM_UNIXTIME(timed / 1000)) as minutes,
    avg(NO2) as rawValue,
    FROM_UNIXTIME(timed / 1000) as timeValue,
    IF(NO2 IS NOT NULL, 1, 0) as IsValid,
    NULL as validatedValue
FROM
    nabelnrt_pay
WHERE
    timed > 1360236120000
    AND timed < 1391772120000
GROUP BY years, months, days
UNION
SELECT 
    Month(FROM_UNIXTIME(timed / 1000)) as months,
    Year(FROM_UNIXTIME(timed / 1000)) as years,
    DAYOFMONTH(FROM_UNIXTIME(timed / 1000)) as days,
    HOUR(FROM_UNIXTIME(timed / 1000)) as hours,
    MINUTE(FROM_UNIXTIME(timed / 1000)) as minutes,
    avg(NO2) as validatedValue,
    FROM_UNIXTIME(timed / 1000) as timeValue,
    IF(NO2 IS NOT NULL, 1, 0) as IsValid,
    NULL as rawValue
FROM
    nabelvalidated_pay
WHERE
    timed > 1360236120000
    AND timed < 1391772120000
GROUP BY years, months, days) as mergedData ORDER BY timeValue

This results in following result: enter image description here

The problem I know have is, that when both table contain an entry I get two rows out of the query. I want to merge them to one row, so I got only 1 row per timestamp in the result. Also for some reason the data of mergedData.validatedValue gets displayed in rawValue instead of validatedValue.

Can someone tell me how I cam merge those two rows to 1 single resulting row and why the validatedData gets displayed in the wrong column?

Upvotes: 0

Views: 306

Answers (2)

chingeez
chingeez

Reputation: 100

The order of columns must be same in queries used with union. Your query should be something similar to:

SELECT max(mergedData.rawValue),
       max(mergedData.validatedValue),
       CAST(IF(max(mergedData.validatedValue) IS NULL, max(mergedData.rawValue),  
       max(mergedData.validatedValue)) AS DECIMAL(65, 2)) as sensorValue,
       max(timeValue),
       max(IsValid)  
FROM ((SELECT Month(FROM_UNIXTIME(timed / 1000)) as months,
          Year(FROM_UNIXTIME(timed / 1000)) as years,
          DAYOFMONTH(FROM_UNIXTIME(timed / 1000)) as days,
          HOUR(FROM_UNIXTIME(timed / 1000)) as hours,
          MINUTE(FROM_UNIXTIME(timed / 1000)) as minutes,
          avg(NO2) as rawValue,
          FROM_UNIXTIME(timed / 1000) as timeValue,
          0 IsValid,
          NULL as validatedValue
  FROM nabelnrt_pay
  WHERE timed > 1360236120000 AND timed < 1391772120000
  GROUP BY years, months, days
 )
 UNION ALL
 (SELECT Month(FROM_UNIXTIME(timed / 1000)) as months,
         Year(FROM_UNIXTIME(timed / 1000)) as years,
         DAYOFMONTH(FROM_UNIXTIME(timed / 1000)) as days,
         HOUR(FROM_UNIXTIME(timed / 1000)) as hours,
         MINUTE(FROM_UNIXTIME(timed / 1000)) as minutes,
         NULL as rawValue,
         FROM_UNIXTIME(timed / 1000) as timeValue,
         1 AS IsValid,
         avg(NO2) as validatedValue
  FROM nabelvalidated_pay
  WHERE timed > 1360236120000 AND timed < 1391772120000
  GROUP BY years, months, days
 )
) as mergedData
GROUP BY years, months, days
ORDER BY timeValue

After this change, you will get two rows for validated readings and one row for the non-validated ones. The last group by you used will group these 2 rows for validated readings. Because you use aggregate functions in the select like max(mergedData.rawValue), you will get desired results.

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269563

The problem is that you are getting both rows in the final data. You need to aggregate again. In addition, because you want both values in the final result, you should use union all rather than union. union removes duplicates, which is unnecessary because the rows are not going to be the same between the two subqueries:

SELECT max(mergedData.rawValue),
       max(mergedData.validatedValue),
       CAST(IF(mergedData.validatedValue IS NULL, mergedData.rawValue, mergedData.validatedValue) AS DECIMAL(65, 2)) as sensorValue,
       timeValue,
       IsValid
FROM ((SELECT Month(FROM_UNIXTIME(timed / 1000)) as months,
              Year(FROM_UNIXTIME(timed / 1000)) as years,
              DAYOFMONTH(FROM_UNIXTIME(timed / 1000)) as days,
              HOUR(FROM_UNIXTIME(timed / 1000)) as hours,
              MINUTE(FROM_UNIXTIME(timed / 1000)) as minutes,
              avg(NO2) as rawValue,
              FROM_UNIXTIME(timed / 1000) as timeValue,
              IF(NO2 IS NOT NULL, 1, 0) as IsValid,
              NULL as validatedValue
      FROM nabelnrt_pay
      WHERE timed > 1360236120000 AND timed < 1391772120000
      GROUP BY years, months, days
     )
     UNION ALL
     (SELECT Month(FROM_UNIXTIME(timed / 1000)) as months,
             Year(FROM_UNIXTIME(timed / 1000)) as years,
             DAYOFMONTH(FROM_UNIXTIME(timed / 1000)) as days,
             HOUR(FROM_UNIXTIME(timed / 1000)) as hours,
             MINUTE(FROM_UNIXTIME(timed / 1000)) as minutes,
             avg(NO2) as validatedValue,
             FROM_UNIXTIME(timed / 1000) as timeValue,
             IF(NO2 IS NOT NULL, 1, 0) as IsValid,
             NULL as rawValue
      FROM nabelvalidated_pay
      WHERE timed > 1360236120000 AND timed < 1391772120000
      GROUP BY years, months, days
     )
    ) as mergedData
GROUP BY years, months, days
ORDER BY timeValue

Upvotes: 1

Related Questions