gjin
gjin

Reputation: 929

inserting into table closing rows while keeping the date column

Following this question.

My table

id    sum    type    date 
1     3      -1      2017-02-02
1     6      -1      2017-02-04
1     -6      2      2017-02-01
1     -3      1      2017-02-09
1     3      -1      2017-02-17
1     6      -1      2017-02-05

This query finds people who pass the conditions and returns an occurrences number of rows of those users with some columns modified.

with t as(
select id
  , -abs (sum) as sum   
  , sum (case when type = -1 then 1 else -1 end) as occurrences
  --, collect_list(date) as time_col
from table
group by id, abs(sum)
having sum (case when type = -1 then 1 else -1 end) > 15
)
select t.id
  , t.sum  
  , 2 as type
from t 
lateral view explode (split (space (cast (occurrences as int) - 1),' ')) e
-- lateral view explode(time_col) time_table as time_key;

The problem is, I need every row to hold one date column from the list. I tried adding , collect_list(date) as time_col and then

lateral view explode(time_col) time_table as time_key;

but this just returned all possible combinations. I could probably use a join(would that work?), but I wondered if that's really necessary.

In the end these rows

1     3      -1      2017-02-17
1     6      -1      2017-02-05

would transform into

1     -3      2    2017-02-17
1     -6      2    2017-02-05

Upvotes: 1

Views: 51

Answers (1)

David דודו Markovitz
David דודו Markovitz

Reputation: 44941

select      val_id
           ,-val_sum    as val_sum
           ,2           as val_type
           ,val_date

from       (select      val_id
                       ,val_sum
                       ,val_type
                       ,val_date

                       ,sum (case when val_type = -1 then 1 else -1 end) over   
                        (
                            partition by    val_id,-abs (val_sum) 
                        ) as occurrences

                       ,row_number () over
                        (
                            partition by    val_id,val_sum 
                            order by        val_date desc
                        )   as rn

            from        mytable
            ) t

where       val_type     =  -1
        and rn          <=  occurrences
        and occurrences >   15
;

Execution results (without and occurrences > 15)

+--------+---------+----------+------------+
| val_id | val_sum | val_type | val_date   |
+--------+---------+----------+------------+
| 1      | -3      | 2        | 2017-02-17 |
+--------+---------+----------+------------+
| 1      | -6      | 2        | 2017-02-05 |
+--------+---------+----------+------------+

Upvotes: 1

Related Questions