BoyKislot
BoyKislot

Reputation: 43

How to insert row data between consecutive dates in HIVE?

Sample Data:

 customer    txn_date    tag
    A           1-Jan-17    1   
    A           2-Jan-17    1 
    A           4-Jan-17    1 
    A           5-Jan-17    0         
    B           3-Jan-17    1
    B           5-Jan-17    0

Need to fill every missing txn_date between date range (1-Jan-17 to 5-Jan-2017). Just like below:

Output should be:

customer    txn_date    tag
A           1-Jan-17    1   
A           2-Jan-17    1 
A           3-Jan-17    0 (inserted)
A           4-Jan-17    1 
A           5-Jan-17    0  
B           1-Jan-17    0 (inserted)
B           2-Jan-17    0 (inserted)
B           3-Jan-17    1
B           4-Jan-17    0 (inserted)
B           5-Jan-17    0

Upvotes: 2

Views: 1812

Answers (2)

David דודו Markovitz
David דודו Markovitz

Reputation: 44941

select  c.customer
       ,d.txn_date
       ,coalesce(t.tag,0) as tag       

from   (select date_add (from_date,i)   as txn_date

        from   (select  date '2017-01-01'   as from_date
                       ,date '2017-01-05'   as to_date
                ) p

                lateral view 
                posexplode(split(space(datediff(p.to_date,p.from_date)),' ')) pe as i,x
        ) d

        cross join (select  distinct 
                            customer 

                    from    t
                    ) c

        left join   t

        on          t.customer  = c.customer
                and t.txn_date  = d.txn_date
;                

c.customer  d.txn_date  tag
A   2017-01-01  1
A   2017-01-02  1
A   2017-01-03  0
A   2017-01-04  1
A   2017-01-05  0
B   2017-01-01  0
B   2017-01-02  0
B   2017-01-03  1
B   2017-01-04  0
B   2017-01-05  0

Upvotes: 6

K S Nidhin
K S Nidhin

Reputation: 2650

Just have the delta content i.e the missing data in a file(input.txt) delimited with the same delimiter you have mentioned when you created the table.

Then use the load data command to insert this records into the table.

load data local inpath '/tmp/input.txt' into table tablename;

Your data wont be in the order you have mentioned , it would get appended to the last. You could retrieve the order by adding order by txn_date in the select query.

Upvotes: 0

Related Questions