Reputation: 21
I have a dataframe that looks like the following
Category Date Value
A Jan 1
A Feb 1
A Mar 1
B Jan 1
B Feb 1
C Jan 1
C Mar 1
and I want to fill up the missing months for each category with a value of 0, ie.
Category Date Value
A Jan 1
A Feb 1
A Mar 1
B Jan 1
B Feb 1
B Mar 0
C Jan 1
C Feb 0
C Mar 1
I'm not really sure where to start. Thanks in advance!
Upvotes: 1
Views: 756
Reputation: 14094
Another approach
df.pivot_table(index=['Category'], columns='Date', values='Value').fillna(0).stack().reset_index()
Category Date 0
0 A Feb 1.0
1 A Jan 1.0
2 A Mar 1.0
3 B Feb 1.0
4 B Jan 1.0
5 B Mar 0.0
6 C Feb 0.0
7 C Jan 1.0
8 C Mar 1.0
Upvotes: 0
Reputation: 8033
You can unstack
with fill_value=0
& stack
to get your result
df.set_index(["Category","Date"]).unstack(fill_value=0).stack().reset_index()
Output
Category Date Value
0 A Feb 1
1 A Jan 1
2 A Mar 1
3 B Feb 1
4 B Jan 1
5 B Mar 0
6 C Feb 0
7 C Jan 1
8 C Mar 1
Upvotes: 3
Reputation: 22503
You can reindex
with multiindex:
multi = [(x,y) for x in df["Category"].unique() for y in df["Date"].unique()]
print (df.set_index(["Category","Date"]).reindex(multi).fillna(0).reset_index())
Category Date Value
0 A Jan 1.0
1 A Feb 1.0
2 A Mar 1.0
3 B Jan 1.0
4 B Feb 1.0
5 B Mar 0.0
6 C Jan 1.0
7 C Feb 0.0
8 C Mar 1.0
Upvotes: 1