user6064424
user6064424

Reputation:

Retrieve last non-null record of every column for each record_id in MySQL

I have this MySQL table named records. Below is its contents.

id  record_id   Data1   Data2   Time 
1   1           null    1       1/1/16
2   1           1       null    1/3/16
3   1           2       null    1/4/16
4   1           null    3       1/5/16
5   2           1       null    2/1/16
6   2           1       null    2/3/16
7   2           7       null    2/4/16
8   2           null    5       2/5/16

I would like to have a MySQL query to retrieve the last non-null record of each column for each record_id. The result would look something like;

record_id   Data1   Data2   Time 
1           2       3       1/5/16
2           7       5       2/5/16

The tricky part to this problem is that multiple columns are involved.

Upvotes: 4

Views: 1349

Answers (4)

Felypp Oliveira
Felypp Oliveira

Reputation: 2197

This one may solve your problem:

select 
  record_id,
  substring_index(group_concat(Data1 order by Time desc), ',', 1) Data1,
  substring_index(group_concat(Data2 order by Time desc), ',', 1) Data2,
  substring_index(group_concat(Time  order by Time desc), ',', 1) Time
from records
group by record_id
;

It may not be as fast as other answers, but is another version... give it a try. If you have a Data3 column in your table, you can copy/paste the Data1 column and just change all references of this column to the new one.

Just to explain how this works: the group_concat function concatenates all non-null values of a column with a separator (, by default). You can order the column before the concatenation. It works a bit like a window function in Oracle, Postgre, and others... The substring_index is just getting the first concatenated value, as the list is in a descending order of time.

Upvotes: 1

sstan
sstan

Reputation: 36503

I would like to have a MySQL query to retrieve the last non-null record of each column for each record_id.

Of course, what is still somewhat unclear is how you determine that a row is the last row, since rows in a database are by definition unordered.

So, my interpretation is that you want the last non-null Data1, Data2 and Time column values for each distinct record_id value. And a row value is considered last if it has a higher id value than another row value.

Assuming my understanding is correct, the following query would work:

select t.record_id,
       (select t2.Data1 
          from tbl t2 
         where t2.record_id = t.record_id 
           and t2.Data1 is not null
          order by t2.id desc
          limit 1) as Data1,
       (select t2.Data2
          from tbl t2 
         where t2.record_id = t.record_id 
           and t2.Data2 is not null
          order by t2.id desc
          limit 1) as Data2,
       (select t2.Time
          from tbl t2 
         where t2.record_id = t.record_id 
           and t2.Time is not null
          order by t2.id desc
          limit 1) as Time
  from tbl t
group by t.record_id
order by t.record_id

SQLFiddle Demo

Upvotes: 1

Matt
Matt

Reputation: 14341

it looks like you are just wanting the maximum data1, max data2, and max time which would be simple aggregation:

SELECT
  record_id
  ,MAX(Data1) as Data1
  ,MAX(Data2) as Data2
  ,MAX(Time) as Time
FROM
  yourTable
GROUP BY
  record_id

SQL fiddle for it http://www.sqlfiddle.com/#!9/d95bc1/2

If latest non-null value per column is desired you can use:

SELECT t.record_id, MAX(t.Data1) as Data1, MAX(t.Data2) as Data2, MAX(t.Time) as Time
FROM
  yourTable t
  LEFT JOIN
  (
    SELECT
      record_id, MAX(Time) as MaxTime
    FROM
      yourTable t
    WHERE
      Data1 IS NOT NULL
    GROUP BY
      record_id
  ) d1
  ON t.record_id = d1.record_id
  AND t.Time = d1.MaxTime
  LEFT JOIN
  (
    SELECT
      record_id, MAX(Time) as MaxTime
    FROM
      yourTable t
    WHERE
      Data2 IS NOT NULL
    GROUP BY
      record_id  
   ) d2
  ON t.record_id = d2.record_id
  AND t.Time = d2.MaxTime
WHERE
  d1.record_id IS NOT NULL
  OR d2.record_id IS NOT NULL
GROUP BY
  t.record_id

Using Tim's method you can actually still get to your results looking at the Latest Data1 record and then the latest Data2 record and then aggregating so they are not purely the MAX of everything but rather representative of the latest 2 records 1 for Data1 and 1 for Data2.

SQL fiddle for this part: http://www.sqlfiddle.com/#!9/d95bc1/10

Upvotes: 1

Tim Biegeleisen
Tim Biegeleisen

Reputation: 521674

SELECT t1.*
FROM yourTable t1
INNER JOIN
(
    SELECT record_id, MAX(Time) AS Time
    FROM yourTable
    GROUP BY record_id
) t2
    ON t1.record_id = t2.record_id AND
       t1.Time = t2.Time

If you simply want the greatest value for the data and time columns, then see the answer given by @Matt. But your language makes it unclear what you really want.

Update:

Something like this might give the results you want:

SELECT a.record_id,
       a.Data1,
       b.Data2,
       c.Time
FROM
(
    SELECT t1.record_id,
           t1.Data1
    FROM yourTable t1
    INNER JOIN
    (
        SELECT record_id,
               MAX(CASE WHEN Data1 IS NULL THEN 0 ELSE id END) AS Data1Id
        FROM yourTable
        GROUP BY record_id
    ) t2
        ON t1.record_id = t2.record_id AND
           t1.Id = t2.Data1Id
) a
INNER JOIN
(
    SELECT t1.record_id,
           t1.Data2
    FROM yourTable t1
    INNER JOIN
    (
        SELECT record_id,
               MAX(CASE WHEN Data2 IS NULL THEN 0 ELSE id END) AS Data2Id
        FROM yourTable
        GROUP BY record_id
    ) t2
        ON t1.record_id = t2.record_id AND
           t1.Id = t2.Data2Id
) b
    ON a.record_id = b.record_id
INNER JOIN
(
    SELECT t1.record_id,
           t1.Time
    FROM yourTable t1
    INNER JOIN
    (
        SELECT record_id,
               MAX(CASE WHEN Data2 IS NULL THEN 0 ELSE id END) AS TimeId
        FROM yourTable
        GROUP BY record_id
    ) t2
        ON t1.record_id = t2.record_id AND
           t1.Id = t2.TimeId
) c
    ON a.record_id = c.record_id

Demo Here:

SQLFiddle

Upvotes: 3

Related Questions