Victor Zhuang
Victor Zhuang

Reputation: 1

find next values satisfying a condition in SQL

Suppose that I have a dataframe as:

| ID      | Value | Time |
|---------|-------|------|
|   101   |  100  |   1  |
|   101   |   0   |   2  |
|   101   |  200  |   4  |
|   101   |  200  |   7  |
|   101   |   0   |  10  |
|   102   |  100  |   2  |
|   102   |   0   |   3  |
|   102   |  200  |   5  |

For each non-zero Value, I would like to find the next Time that Value=0 for the same ID. So my desired output will be

| ID      | Value | Time | NextTime |
|---------|-------|------|----------|
|   101   |  100  |   1  |     2    |
|   101   |   0   |   2  |   Null   |
|   101   |  200  |   4  |    10    |
|   101   |  200  |   7  |    10    |
|   101   |   0   |  10  |   Null   |
|   102   |  100  |   2  |     3    |
|   102   |   0   |   3  |   Null   |
|   102   |  200  |   5  |   Null   |

I have tried to use the following subquery:

SELECT *, CASE WHEN Value=0 THEN NULL ELSE (SELECT MIN(Time) FROM Table1  sub 
        WHERE sub.ID = main.ID AND sub.Time > main.Time AND sub.Value=0) END as NextTime
FROM Table1 AS main
 ORDER BY
  ID,
  Time

This query should work, but the problem is that I am working with a extremely large dataframe (millions records), so this query can not be finished in a reasonable time. Could any one help with a more efficient way to get the desired result? Thanks.

Upvotes: 0

Views: 607

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269683

You want a cumulative minimum:

select t.*,
       min(case when value = 0 then time end) over 
           (partition by id
            order by time
            rows between 1 following and unbounded following
           ) as next_0_time
from t;

EDIT:

If you want values on the 0 rows to be NULL, then use a case expression:

select t.*,
       (case when value <> 0 
             then min(case when value = 0 then time end) over 
                      (partition by id
                       order by time
                       rows between 1 following and unbounded following
                      )
        end) as next_0_time
from t;

Here is a db<>fiddle.

Upvotes: 1

Related Questions