bravepaperbag
bravepaperbag

Reputation: 21

Pandas: How to fill missing values grouped by two columns

I have a dataframe that looks like the following

Category Date   Value
   A      Jan     1
   A      Feb     1
   A      Mar     1
   B      Jan     1
   B      Feb     1
   C      Jan     1
   C      Mar     1

and I want to fill up the missing months for each category with a value of 0, ie.

Category Date   Value
   A      Jan     1
   A      Feb     1
   A      Mar     1
   B      Jan     1
   B      Feb     1
   B      Mar     0
   C      Jan     1
   C      Feb     0
   C      Mar     1

I'm not really sure where to start. Thanks in advance!

Upvotes: 1

Views: 756

Answers (3)

Kenan
Kenan

Reputation: 14094

Another approach

df.pivot_table(index=['Category'], columns='Date', values='Value').fillna(0).stack().reset_index()

  Category Date    0
0        A  Feb  1.0
1        A  Jan  1.0
2        A  Mar  1.0
3        B  Feb  1.0
4        B  Jan  1.0
5        B  Mar  0.0
6        C  Feb  0.0
7        C  Jan  1.0
8        C  Mar  1.0

Upvotes: 0

moys
moys

Reputation: 8033

You can unstack with fill_value=0 & stack to get your result

df.set_index(["Category","Date"]).unstack(fill_value=0).stack().reset_index()

Output

   Category     Date    Value
0   A            Feb    1
1   A            Jan    1
2   A            Mar    1
3   B            Feb    1
4   B            Jan    1
5   B            Mar    0
6   C            Feb    0
7   C            Jan    1
8   C            Mar    1

Upvotes: 3

Henry Yik
Henry Yik

Reputation: 22503

You can reindex with multiindex:

multi = [(x,y) for x in df["Category"].unique() for y in df["Date"].unique()]

print (df.set_index(["Category","Date"]).reindex(multi).fillna(0).reset_index())

  Category Date  Value
0        A  Jan    1.0
1        A  Feb    1.0
2        A  Mar    1.0
3        B  Jan    1.0
4        B  Feb    1.0
5        B  Mar    0.0
6        C  Jan    1.0
7        C  Feb    0.0
8        C  Mar    1.0

Upvotes: 1

Related Questions