How to join a table with each group of a dataframe in pandas

Question

I have a dataframe like below. Each date is Monday of each week.

df = pd.DataFrame({'date' :['2020-04-20', '2020-05-11','2020-05-18',
                                 '2020-04-20', '2020-04-27','2020-05-04','2020-05-18'],
                         'name': ['A', 'A', 'A', 'B', 'B', 'B', 'B'], 
                          'count': [23, 44, 125, 6, 9, 10, 122]})

    date      name  count
0   2020-04-20  A   23
1   2020-05-11  A   44
2   2020-05-18  A   125
3   2020-04-20  B   6
4   2020-04-27  B   9
5   2020-05-04  B   10
6   2020-05-18  B   122

Neither 'A' and 'B' covers the whole date range. Both of them have some missing dates, which means the counts on that week is 0. Below is all the dates:

df_dates = pd.DataFrame({ 'date':['2020-04-20', '2020-04-27','2020-05-04','2020-05-11','2020-05-18'] })

So what I need is like the dataframe below:

    date      name  count
0   2020-04-20  A   23
1   2020-04-27  A   0
2   2020-05-04  A   0
3   2020-05-11  A   44
4   2020-05-18  A   125
5   2020-04-20  B   6
6   2020-04-27  B   9
7   2020-05-04  B   10
8   2020-05-11  B   0
9   2020-05-18  B   122

It seems like I need to join (merge) df_dates with df for each name group ( A and B) and then fill the data with missing name and missing count value with 0's. Does anyone know achieve that? how I can join with another table with a grouped table?

I tried and no luck...

pd.merge(df_dates, df.groupby('name'), how='left', on='date')

BENY · Accepted Answer

We can do reindex with multiple index creation

idx=pd.MultiIndex.from_product([df_dates.date,df.name.unique()],names=['date','name'])  
s=df.set_index(['date','name']).reindex(idx,fill_value=0).reset_index().sort_values('name')
Out[136]: 
         date name  count
0  2020-04-20    A     23
2  2020-04-27    A      0
4  2020-05-04    A      0
6  2020-05-11    A     44
8  2020-05-18    A    125
1  2020-04-20    B      6
3  2020-04-27    B      9
5  2020-05-04    B     10
7  2020-05-11    B      0
9  2020-05-18    B    122

Or

s=df.pivot(*df.columns).reindex(df_dates.date).fillna(0).reset_index().melt('date')
Out[145]: 
         date name  value
0  2020-04-20    A   23.0
1  2020-04-27    A    0.0
2  2020-05-04    A    0.0
3  2020-05-11    A   44.0
4  2020-05-18    A  125.0
5  2020-04-20    B    6.0
6  2020-04-27    B    9.0
7  2020-05-04    B   10.0
8  2020-05-11    B    0.0
9  2020-05-18    B  122.0

How to join a table with each group of a dataframe in pandas

Answers (2)

Related Questions