Reputation: 481
Let's say I have a dataframe df:
df = pd.DataFrame({'col1': [1,1,2,2,2], 'col2': ['A','B','A','B','C'], 'value': [2,4,6,8,10]})
col1 col2 value
0 1 A 2
1 1 B 4
2 2 A 6
3 2 B 8
4 2 C 10
I'm looking for a way to create any missing rows among the possible combination of col1
and col2
with exiting values, and fill in the missing rows with zeros
The desired result would be:
col1 col2 value
0 1 A 2
1 1 B 4
2 2 A 6
3 2 B 8
4 2 C 10
5 1 C 0 <- Missing the "1-C" combination, so create it w/ value = 0
I've looked into using stack
and unstack
to make this work, but I'm not sure that's exactly what I need.
Thanks in advance
Upvotes: 6
Views: 1497
Reputation: 25269
Another way using unstack
with fill_value=0
and stack
, reset_index
df.set_index(['col1','col2']).unstack(fill_value=0).stack().reset_index()
Out[311]:
col1 col2 value
0 1 A 2
1 1 B 4
2 1 C 0
3 2 A 6
4 2 B 8
5 2 C 10
Upvotes: 5
Reputation: 61920
You could use reindex + MultiIndex.from_product:
index = pd.MultiIndex.from_product([df.col1.unique(), df.col2.unique()])
result = df.set_index(['col1', 'col2']).reindex(index, fill_value=0).reset_index()
print(result)
Output
col1 col2 value
0 1 A 2
1 1 B 4
2 1 C 0
3 2 A 6
4 2 B 8
5 2 C 10
Upvotes: 1
Reputation: 323356
Use pivot
, then stack
df.pivot(*df.columns).fillna(0).stack().to_frame('values').reset_index()
Out[564]:
col1 col2 values
0 1 A 2.0
1 1 B 4.0
2 1 C 0.0
3 2 A 6.0
4 2 B 8.0
5 2 C 10.0
Upvotes: 7