qfd
qfd

Reputation: 788

add missing data and dates in dataframe

I have the following dataframe

    A   B   C    D     date 
   AU  AT   0.9  0.7   3/31/1960
   AU  AT   0.3  0.6   6/30/1960 
   AT  AU   0.7  0.5   4/30/1960
   AT  AU   0.65 0.4   6/30/1960

As an example, lets say my minimum start date is 3/31/1960 and my max is 6/30/1960

I would like to forward fill column C and D, based on data grouped by A, B and date. So my final would data frame would be

    A   B   C    D      date
    AU  AT  0.9  0.7    3/31/1960
    AU  AT  0.9  0.7    4/30/1960
    AU  AT  0.9  0.7    5/31/1960
    AU  AT  0.3  0.6    6/30/1960
    AT  AU  0.7  0.5    4/30/1960
    AT  AU  0.7  0.5    5/31/1960
    AT  AU  0.65 0.4    6/30/1960

any help is appreciated many thanks

Upvotes: 0

Views: 44

Answers (1)

BENY
BENY

Reputation: 323226

You can do with reindex within groupby

df.date=pd.to_datetime(df.date)
df
Out[85]:
    A   B     C    D       date
0  AU  AT  0.90  0.7 1960-03-31
1  AU  AT  0.30  0.6 1960-06-30
2  AT  AU  0.70  0.5 1960-04-30
3  AT  AU  0.65  0.4 1960-06-30

df.groupby('A').apply(lambda x : x.set_index(['date']).reindex(pd.date_range(x['date'].min(),x['date'].max(),freq='m')).ffill())
Out[91]: 
                A   B     C    D
A                               
AT 1960-04-30  AT  AU  0.70  0.5
   1960-05-31  AT  AU  0.70  0.5
   1960-06-30  AT  AU  0.65  0.4
AU 1960-03-31  AU  AT  0.90  0.7
   1960-04-30  AU  AT  0.90  0.7
   1960-05-31  AU  AT  0.90  0.7
   1960-06-30  AU  AT  0.30  0.6 

Upvotes: 2

Related Questions