Reputation: 5384
I would like to return a set of unique records from a table based on two columns along with the most recent posting time and a total count of the number of times the combination of those two columns has appeared before (in time) the record of their output.
So what I'm trying to get is something along these lines:
select col1, col2, max_posted, count from T
join (
select col1, col2, max(posted) as posted from T where groupid = "XXX"
group by col1, col2) h
on ( T.col1 = h.col1 and
T.col2 = h.col2 and
T.max_posted = h.tposted)
where T.groupid = 'XXX'
Count needs to be the number of times EACH combination of col1 and col2 occurred BEFORE the max_posted of each record in the output. (I hope I explained that correctly :)
Edit: In trying the below suggestion as:
select dx.*,
count(*) over (partition by dx.cicd9, dx.cdesc order by dx.tposted) as cnt
from dx
join (
select cicd9, cdesc, max(tposted) as tposted from dx where groupid ="XXX"
group by cicd9, cdesc) h
on ( dx.cicd9 = h.cicd9 and
dx.cdesc = h.cdesc and
dx.tposted = h.tposted)
where groupid = 'XXX';
The count always returns '1'. Additionally, how would you count only the records that occurred before tposted
?
This also fails, but I hope you can get where I'm headed:
WITH H AS (
SELECT cicd9, cdesc, max(tposted) as tposted from dx where groupid = 'XXX'
group by cicd9, cdesc),
J AS (
SELECT count(*) as cnt
FROM dx, h
WHERE dx.cicd9 = h.cicd9
and dx.cdesc = h.cdesc
and dx.tposted <= h.tposted
and dx.groupid = 'XXX'
)
SELECT H.*,J.cnt
FROM H,J
Help anyone?
Upvotes: 2
Views: 2278
Reputation: 656596
This was confusing:
Count needs to be the number of times EACH combination of col1 and col2 occurred BEFORE the max_posted of each record in the output.
Since, by definition, every record is "before" (or at the same time as) the latest post, this essentially means the total count per combination (ignoring the assumed off-by-one error in the sentence).
So this burns down to a simple GROUP BY
:
SELECT cicd9, cdesc
, max(posted) AS last_posted
, count(*) AS ct
FROM dx
WHERE groupid = 'XXX'
GROUP BY 1, 2
ORDER BY 1, 2;
Which does exactly the same as the currently accepted answer. Just a lot faster and simpler.
Upvotes: 0
Reputation: 32199
How about this:
SELECT DISTINCT ON (cicd9, cdesc) cicd9, cdesc,
max(posted) OVER w AS last_post,
count(*) OVER w AS num_posts
FROM dx
WHERE groupid = 'XXX'
WINDOW w AS (
PARTITION BY cicd9, cdesc
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
);
Given the lack of PG version, table definition, data and desired output this is just shooting from the hip, but the principle should work: Make a partition on the two columns where groupid = 'XXX'
, then find the maximum value of the posted
column and the total number of rows in the window frame (hence the RANGE...
clause in the window definition).
Upvotes: 1
Reputation: 5384
This was the best I could come up with -- better suggestions are welcome!
This will produce the results I need, with the understanding that count will always be at least 1 (from the join):
SELECT dx.cicd9, dx.cdesc, max(dx.tposted), count(*)
from dx
join (
SELECT cicd9, cdesc, max(tposted) as tposted from dx where groupid = 'XXX'
group by cicd9, cdesc) h
on
(dx.cicd9 = h.cicd9 and dx.cdesc = h.cdesc and dx.tposted <= h.tposted
and dx.groupid = 'XXX')
group by dx.cicd9, dx.cdesc
order by dx.cdesc;
or
WITH H AS (
SELECT cicd9, cdesc, max(tposted) as tposted from dx where groupid = 'XXX'
group by cicd9, cdesc)
SELECT dx.cicd9, dx.cdesc, max(dx.tposted), count(*)
from dx, H
where dx.cicd9 = h.cicd9 and dx.cdesc = h.cdesc and dx.tposted <= h.tposted
and dx.groupid = 'XXX'
group by dx.cicd9, dx.cdesc
order by cdesc;
Upvotes: 0
Reputation: 1269693
Do you just want a cumulative count?
select t.*,
count(*) over (partition by col1, col2 order by posted) as cnt
from table t
where groupid = 'xxx';
Upvotes: 0