Reputation: 354
For instance we have rows like
in | out |
---|---|
2020-01-01 | 2020-03-03 |
and I need result with a list of months between, i.e.:
in | out | between |
---|---|---|
2020-01-01 | 2020-03-03 | ["2020-01-01", "2020-02-01", "2020-03-01"] |
to count then all months by some grouping.
There can be any day, but with kept months.
I try to reach this via combination of explode/posexplode, months_between, add_months, CTE, but have a lack of brain yet. One of:
select
*,
case
when cast(
ceil(abs(months_between(test.in, test.out))) as int
) > 1 then split(
repeat(
concat(add_months(test.in, 1), ','),
cast(
ceil(abs(months_between(test.in, test.out))) as int
)
),
','
)
else array(test.in)
end btw
from test;
in | out | btw |
---|---|---|
2020-01-01 | 2020-03-03 | ["2020-02-01", "2020-02-01", "2020-02-01", ""] |
I can't figure out how to increment months inside query with months_between
. I believe there is some way, but I need a pointer.
Upvotes: 1
Views: 607
Reputation: 38290
Use lateral view posexplode to get rows with position numbers, add position to the IN date using add_months, collect_list to get an array, concatenate array using concat_ws if necessary.
Demo:
WITH test AS (
SELECT '2020-01-06' dt_in, '2020-03-03' dt_out
)
SELECT dt_in, dt_out,
COLLECT_LIST(DATE_FORMAT(ADD_MONTHS(t.dt_in, pos), 'yyyy-MM-01')) AS `between`
FROM test t
LATERAL VIEW OUTER POSEXPLODE(
SPLIT(
SPACE(
CAST(CEIL(ABS(MONTHS_BETWEEN(DATE_FORMAT(t.dt_in, 'yyyy-MM-01'), t.dt_out))) AS INT) - 1
),
' '
)
) e AS pos, x
GROUP BY dt_in, dt_out;
Result:
dt_in dt_out between
2020-01-01 2020-03-03 ["2020-01-01","2020-02-01","2020-03-01"]
Upvotes: 1