Reputation: 147
I have a specific task using PostgreSQL, which i need to come with a SQL query, but unfortunately, i'm not able to get a right result.
The example of the table i have:
----------------------------------------------
| ID | value | usage-date |
----------------------------------------------
| 1 | value1 | 2020-09-10 |
----------------------------------------------
| 1 | value1 | 2020-09-15 |
----------------------------------------------
| 1 | value1 | 2020-09-20 |
----------------------------------------------
| 1 | value1 | 2020-09-23 |
----------------------------------------------
| 1 | value1 | 2020-09-25 |
----------------------------------------------
| 1 | value1 | 2020-09-30 |
----------------------------------------------
| 1 | value2 | 2020-09-15 |
----------------------------------------------
| 1 | value2 | 2020-09-20 |
----------------------------------------------
| 1 | value2 | 2020-09-23 |
----------------------------------------------
| 1 | value2 | 2020-09-25 |
----------------------------------------------
So the table is ordered by ID, value and usage-date. My task is to extract intervals that should tell from which date to which date a certain value was active. So the output would look like following:
--------------------------------------------------------------
| ID | value | start-date | end-date |
--------------------------------------------------------------
| 1 | value1 | 2020-09-10 | 2020-09-15 |
--------------------------------------------------------------
| 1 | value2 | 2020-09-15 | 2020-09-20 |
--------------------------------------------------------------
| 1 | value1 | 2020-09-20 | 2020-09-23 |
--------------------------------------------------------------
| 1 | value2 | 2020-09-23 | 2020-09-25 |
--------------------------------------------------------------
| 1 | value1 | 2020-09-25 | 2020-09-30 |
--------------------------------------------------------------
Anyone some idea how is it possible to do it?
Here is an SQL fiddle so anyone can try.
Upvotes: 2
Views: 117
Reputation: 175
Your input does not coincide with the expected output , there is no clear distinction between value 1 and value2 interms of expected date range, it seems you just want to brute force the output.
Answer: Rather than making multiple entries for each value if you want a close range of dates for that value you can use the below query:
select distinct one1.id, vals.value1, one.minn as start_date, two.maxx as end_date from
table one1 inner join
(select value1 vals, min(user_date) as minn from table group by value1) one
on
one1.value = one.vals
inner join (
select value1 valm, max(user_date) as maxx from table group by value1) two
on one.vals = two.valm
Upvotes: 1
Reputation: 23666
step-by-step demo:db<>fiddle (click)
I guess, that two rows in your example are one for "start" and one for "end" of the activity. Because you didn't show any additional identifier, the query is more complex than it has to be. Because of this, we first have to remove the "end" rows manually.
SELECT
id,
value,
usagedate,
COALESCE( -- 4
lead(usagedate) OVER (PARTITION BY id ORDER BY usagedate), -- 3
first_value -- 4
)
FROM (
SELECT
*,
row_number() OVER (PARTITION BY id, value ORDER BY usagedate), -- 1
first_value(usagedate) OVER (PARTITION BY id ORDER BY usagedate DESC) -- 2
FROM
t
)s
WHERE row_number % 2 = 1 -- 1
row_number()
window functionWHERE
clause), we can use the lead()
window function to get the value of every ordered next row into the current one. So, we are able to use the start date of every new activity as end date of the current one. This is, why you, actually, don't need the "end" rows and we deleted them.lead()
function returns NULL
. To avoid this, we fetched the very last record (the end of the last activity) previously. With the COALESCE()
function this can be achieved. It takes the first not null value in its parameter list, which is the lead
value if it is not the last one, the first_value
if it isUpvotes: 1