Pandas - expand dataframe using daterange

Question

I have the following dataframe:

 name    from       amount   days
 A       7/31/18    200      1
 B       7/31/18    300      1
 C       7/30/18    200      1
 D       7/27/18    100      3
 ......
 G       7/17/18    50       1
 H       7/13/18    150      4

I'd like to expand it to this, where days does not equal 1:

 name    from       amount   days
 A       7/31/18    200      1
 B       7/31/18    300      1
 C       7/30/18    200      1
 D       7/29/18    100      3
 D       7/28/18    100      3
 D       7/27/18    100      3
 ......
 G       7/17/18    50       1
 H       7/16/18    150      4
 H       7/15/18    150      4
 H       7/14/18    150      4
 H       7/13/18    150      4

If possible, I'd also like to add a column that can distinguish between the original data and expanded data (since I'm going to need to filter some dates eventually):

 name    from       amount   days   original
 A       7/31/18    200      1      1
 B       7/31/18    300      1      1
 C       7/30/18    200      1      1
 D       7/29/18    100      3      0
 D       7/28/18    100      3      0
 D       7/27/18    100      3      1
 ......
 G       7/17/18    50       1      1
 H       7/16/18    150      4      0
 H       7/15/18    150      4      0
 H       7/14/18    150      4      0
 H       7/13/18    150      4      1

Edit: To clarify the expansion: Days will tell you how many rows it needs to be expanded to. Alternatively, you can use the date above the original value as a boundary (the entry 7/27 where days=3 will stop at the above value where the date is 7/30. The data has constraints to make sure it never overlaps).

BENY · Accepted Answer

About two steps create the dataframe(reindex) , and adjust the values (duplicated)

newdf=df.reindex(df.index.repeat(df.days)) # create the df using reindex
adddate=pd.Series(np.concatenate(df.days.apply(np.arange).values),index=newdf.index)# create the timedelta to add 
newdf['from']=pd.to_datetime(newdf['from'])+pd.to_timedelta(adddate,unit='d')# assign the value 
newdf['original']=(~newdf.index.duplicated()).astype(int)
newdf
Out[240]: 
  name       from  amount  days  original
0    A 2018-07-31     200     1         1
1    B 2018-07-31     300     1         1
2    C 2018-07-30     200     1         1
3    D 2018-07-27     100     3         1
3    D 2018-07-28     100     3         0
3    D 2018-07-29     100     3         0

Pandas - expand dataframe using daterange

Answers (2)

Comprehension

Helper Functions

Related Questions