Youssef Razak
Youssef Razak

Reputation: 375

Add missing days in a dataframe

I need to fill missing days in the column 'day':

    id  month   day trans
0   0     8     1   9
1   0     8     2   5
2   0     8     3   10
3   0     8     4   6
4   0     8     6   4
5   0     8     8   4

I am looking for output:

    id  month   day trans
0   0     8     1   9
1   0     8     2   5
2   0     8     3   10
3   0     8     4   6
4   0     8     5   NAN
5   0     8     6   4
6   0     8     7   NAN
7   0     8     8   4

Upvotes: 2

Views: 86

Answers (3)

Ferris
Ferris

Reputation: 5611

Using pandas upsampling.

df['date'] = df.apply(lambda x: datetime(2020, x['month'], x['day']), axis=1)
df = df.set_index('date')
# Upsampling
df_daily = df.resample('D').asfreq().reset_index()

# reassign month and day
df_daily['month'] = df_daily.date.dt.month
df_daily['day'] = df_daily.date.dt.day
df_daily['id'] = df_daily['id'].fillna(method='ffill').astype(int)
del df_daily['date']

Upvotes: 0

wwnde
wwnde

Reputation: 26686

Use reindex()

df1=df.set_index('day').reindex([1,2,3,4,5,6,7]).reset_index()
df1[['month','id']]=df1[['month','id']].ffill()

Following your comment;

  mux = pd.MultiIndex.from_product([df['id'].unique(),[1,2,3,4,5,6,7]], names=['id','day'])
df1=df.set_index(['id','day']).reindex(mux).reset_index()   
df1[['month','id']]=df1[['month','id']].ffill()



id  day  month  #trans
0   0    1    8.0     9.0
1   0    2    8.0     5.0
2   0    3    8.0    10.0
3   0    4    8.0     6.0
4   0    5    8.0     NaN
5   0    6    8.0     4.0
6   0    7    8.0     NaN

Upvotes: 1

Ludovic H
Ludovic H

Reputation: 56

I think the best way to deal with it is building a pandas df that has all the [month, day] values of your output, and left merging your first df on [id, month, day] key.

Upvotes: 0

Related Questions