Reputation: 1693
I have a Pandas DataFrame with a two-level MultiIndex:
               Column1
indexA indexB
1001   aaa           1
       bbb           1
       ccc           1
1002   ddd           1
       eee           1
and would like indexB to have the same values for each value of indexA:
               Column1
indexA indexB
1001   aaa           1
       bbb           1
       ccc           1
       ddd           0
       eee           0
1002   aaa           0
       bbb           0
       ccc           0
       ddd           1
       eee           1
My first thought was to unstack, fillna with 0 and then stack it, but this seems like overkill. Is there an easier method?
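For reference, the unstack/stack idea could look roughly like the sketch below. The sample frame is rebuilt by hand from the tables above, and the name `filled` is only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question's first table (assumed layout).
df = pd.DataFrame(
    {"Column1": [1, 1, 1, 1, 1]},
    index=pd.MultiIndex.from_tuples(
        [(1001, "aaa"), (1001, "bbb"), (1001, "ccc"),
         (1002, "ddd"), (1002, "eee")],
        names=["indexA", "indexB"],
    ),
)

# Pivot indexB into columns, filling absent combinations with 0,
# then pivot it back into the row index.
filled = df.unstack("indexB", fill_value=0).stack("indexB")
print(filled)
```

Passing `fill_value=0` to `unstack` makes a separate `fillna` step unnecessary.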
EDIT: Alexander's answer below works though it takes a long time (my original dataframe has 350k rows). I changed that solution slightly:
df = pd.read_sql(sql=sql, con=db_eng, index_col=index)
idx = pd.MultiIndex.from_product([df.index.levels[0], df.index.levels[1]], names=df.index.names)
df.reindex(idx).fillna(value=0)
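A self-contained version of the same idea, as a sketch: it swaps the `read_sql` call for a hand-built sample frame (an assumption, since the real query is not shown) and passes `fill_value=0` to `reindex` instead of chaining `fillna`. The names `full_index` and `result` are illustrative.

```python
import pandas as pd

# Stand-in for the frame returned by read_sql (assumed sample data).
df = pd.DataFrame(
    {"Column1": [1, 1, 1, 1, 1]},
    index=pd.MultiIndex.from_tuples(
        [(1001, "aaa"), (1001, "bbb"), (1001, "ccc"),
         (1002, "ddd"), (1002, "eee")],
        names=["indexA", "indexB"],
    ),
)

# Cartesian product of the two index levels.
full_index = pd.MultiIndex.from_product(df.index.levels, names=df.index.names)

# Rows missing from df are created and filled with 0.
result = df.reindex(full_index, fill_value=0)
print(result)
```

One caveat: `df.index.levels` can still hold values that were filtered out of the frame earlier, so calling `df.index.remove_unused_levels()` first may be worthwhile if the frame has been sliced.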
Also found these two questions after posting this:
Upvotes: 2
Views: 771
Reputation: 109706
There is probably a better way to do this. I created a new MultiIndex using pd.MultiIndex.from_product. I then created a new dataframe with a dummy value, joined the existing dataframe, and deleted the dummy column.
import pandas as pd

df = pd.DataFrame({'index_0': ['a', 'a', 'b', 'b', 'b'],
                   'index_1': ['A', 'B', 'A', 'B', 'C'],
                   'vals': [1, 2, 3, 4, 5]}).set_index(['index_0', 'index_1'])
>>> df
                 vals
index_0 index_1
a       A           1
        B           2
b       A           3
        B           4
        C           5
idx = pd.MultiIndex.from_product([df.index.levels[0], df.index.levels[1]],
                                 names=df.index.names)
new_df = pd.DataFrame({'_dummy_': [1] * len(idx)}, index=idx).join(df)
del new_df['_dummy_']
>>> new_df
                 vals
index_0 index_1
a       A           1
        B           2
        C         NaN
b       A           3
        B           4
        C           5
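To get zeros rather than NaN, as the question asks, the joined frame can be filled afterwards. The sketch below repeats the steps above so it runs on its own. The final `fillna`/`astype` line is an addition, not part of the original answer, and `zero_filled` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({'index_0': ['a', 'a', 'b', 'b', 'b'],
                   'index_1': ['A', 'B', 'A', 'B', 'C'],
                   'vals': [1, 2, 3, 4, 5]}).set_index(['index_0', 'index_1'])

idx = pd.MultiIndex.from_product([df.index.levels[0], df.index.levels[1]],
                                 names=df.index.names)

# Same dummy-column join as in the answer.
new_df = pd.DataFrame({'_dummy_': [1] * len(idx)}, index=idx).join(df)
del new_df['_dummy_']

# Added step: replace NaN with 0 and cast back to integers,
# since the join turns the column into floats.
zero_filled = new_df.fillna(0).astype(int)
print(zero_filled)
```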
Upvotes: 2