Mauro Tamm
Mauro Tamm

Reputation: 360

SELECT rows without duplicates and WHERE IN ()

   SELECT formid, avals(data) as datavalues, akeys(data) as datakeys, taskid
      FROM task_form_data
      WHERE taskid IN (449750,449699,449620) 
      ORDER BY formid, timestamp DESC

Would be a test query -Problem is that the table (which i cant change) has newer rows added with new data - but formid remains same.

So when i select like that i get old data aswell. I cannot use DISTINCT ON (formid). I would need the newest results ( often 3-4 rows with diff formid) - for each of the taskid IN (comes from php - can be large number - cant do separate query for each). IS there any way to get it working like that?

Data example(just a quick example - larger date value would be oldest timestamp):

formid  timestamp  taskid
6       1          449750
2       2          449750  
2       3          449750
4       4          449750
4       5          449750

What should come out(number of various formid-s can be larger or smaller):

6       1          449750
2       2          449750  
4       4          449750

UPDATE:

SELECT DISTINCT ON(formid) formid, avals(data) as datavalues, akeys(data) as datakeys, taskid, timestamp
      FROM task_form_data
      WHERE taskid IN (450567,449699,449620)
      GROUP BY formid,taskid,data,timestamp
      ORDER BY formid,timestamp DESC

I tried that - it seems to work - but only with the first parameter in where taskid IN. Could it be modified to work with each value in the array?

Upvotes: 0

Views: 108

Answers (3)

Gordon Linoff
Gordon Linoff

Reputation: 1271151

You can do this with row_number(). For instance, if you want the three most recent rows:

SELECT formid, avals(data) as datavalues, akeys(data) as datakeys, taskid
FROM (SELECT d.*, row_number() over (partition by formid, taskid order by timestamp desc) as seqnum
      FROM task_form_data d
      WHERE taskid IN (449750, 449699, 449620)
     ) d
WHERE seqnum <= 3
ORDER BY taskid, formid, timestamp DESC;

Upvotes: 1

Clodoaldo Neto
Clodoaldo Neto

Reputation: 125544

This matches you desired output. SQL Fiddle

select *
from (
    select *,
        row_number() over(partition by taskid order by timestamp) as rn
    from (
        select distinct on (taskid, formid) *
        from task_form_data
        where taskid in (449750,449699,449620)
        order by taskid, formid, timestamp
    ) s
) s
where rn <= 3
order by taskid, timestamp

With the real timestamp it is necessary to add desc everywhere it is used to order because a newer timestamp is bigger than an older one.

Upvotes: 0

user_0
user_0

Reputation: 3363

I don't really like this, but if your table is not huge it can work:

SELECT tfm.formid, tfm.avals(data) as datavalues, tfm.akeys(data) as datakeys, tfm.taskid
FROM task_form_data AS tfm
INNER JOIN
(
    SELECT formid, MAX(timestamp)
    FROM task_form_data
    WHERE taskid IN (449750,449699,449620)
    GROUP BY formid
) AS a
ON tfm.formid = a.formid
AND tfm.timestamp = a.timestamptimestamp
WHERE tfm.taskid IN (449750,449699,449620) 
ORDER BY tfm.formid, timestamp DESC;

Did you really called a field "timestamp"?

Upvotes: 0

Related Questions