Reputation: 725
I have the following two dataframes and I would like to create a third df3 dataframe:
df1
Monday_One Monday_Two Tuesday
Water Water Ice
Cold Hot Hot Cold
0 4 1 1 10
1 5 2 2 9
2 6 3 7 8
3 7 3 4 7
4 8 5 5 6
df2
Area
Water Ice
Cold Hot Cold
0 4 1 10
1 5 2 9
2 6 3 8
3 7 4 7
4 8 5 6
df3
Monday_One Monday_Two Tuesday
Water Water Ice
Cold Hot Hot Cold
0 16 1 1 100
1 25 4 4 81
2 36 9 21 64
3 49 12 16 49
4 64 25 25 36
This is the code to generate it:
idx = pd.IndexSlice
data = {'Col1': [4, 5, 6, 7, 8], 'Col2': [1, 2, 3, 3, 5], 'Col3': [1, 2, 7, 4, 5], 'Col4': [10, 9, 8, 7, 6]}
col = pd.MultiIndex.from_tuples([('Monday_One', 'Water', 'Cold'), ('Monday_One', 'Water', 'Hot'),
('Monday_Two', 'Water', 'Hot'), ('Tuesday', 'Ice', 'Cold')])
df1 = pd.DataFrame(data)
df1.columns = col
data = {'Col1': [4, 5, 6, 7, 8], 'Col2': [1, 2, 3, 4, 5], 'Col3': [10, 9, 8, 7, 6]}
col = pd.MultiIndex.from_tuples([('Area', 'Water', 'Cold'), ('Area', 'Water', 'Hot'),
('Area', 'Ice', 'Cold')])
df2 = pd.DataFrame(data)
df2.columns = col
data = {'Col1': [0, 0, 0, 0, 0]}
col = pd.MultiIndex.from_tuples([('One', 'One', 'One')])
df3 = pd.DataFrame(data)
df3.columns = col
df3[('Monday_One', 'Water', 'Cold')] = df1[('Monday_One', 'Water', 'Cold')] * df2[('Area', 'Water', 'Cold')]
df3[('Monday_One', 'Water', 'Hot')] = df1[('Monday_One', 'Water', 'Hot')] * df2[('Area', 'Water', 'Hot')]
df3[('Monday_Two', 'Water', 'Hot')] = df1[('Monday_Two', 'Water', 'Hot')] * df2[('Area', 'Water', 'Hot')]
df3[('Tuesday', 'Ice', 'Cold')] = df1[('Tuesday', 'Ice', 'Cold')] * df2[('Area', 'Ice', 'Cold')]
df3 = df3.drop(df3.columns[0], axis=1)
The goal is to multiply every column in df1 with the respective column in df2, ignoring the first level of the multiindex.
Upvotes: 3
Views: 103
Reputation: 28644
Another option would be to find the commonalities between the two dataframe columns, multiply them and create a new dataframe with the result:
outcome = {ent: df1[ent] * df2.loc(axis=1)[entry].to_numpy()
for ent in df1.columns
for entry in df2.columns
if ent[1:] == entry[1:]})
pd.DataFrame(outcome)
Monday_One Monday_Two Tuesday
Water Water Ice
Cold Hot Hot Cold
0 16 1 1 100
1 25 4 4 81
2 36 9 21 64
3 49 12 16 49
4 64 25 25 36
Another option, which also uses a pairing to get the right combinations before multiplying and concatenation - note that this second option is slower than the first option, especially as the size of the data increases:
ix = pd.IndexSlice
slicers = [ix[:, "Water", "Hot"],
ix[:, "Water", "Cold"],
ix[:, "Ice", "Cold"]]
left = [df1.loc[:, entry]
for entry in slicers]
right = [df2.loc[:, entry].to_numpy()
for entry in slicers]
pd.concat([l * r for l, r in zip(left, right)], axis = 1)
Monday_One Monday_Two Monday_One Tuesday
Water Water Water Ice
Hot Hot Cold Cold
0 1 1 16 100
1 4 4 25 81
2 9 21 36 64
3 12 16 49 49
4 25 25 64 36
Upvotes: 0
Reputation: 862481
Use DataFrame.rename_axis
for set MultiIndex
names, so possible align DataFrames by DataFrame.align
with remove first level of df2
and then multiple:
df2 = df2.rename_axis(('a','b','c'), axis=1)
df1 = df1.rename_axis(('a','b','c'), axis=1)
df5, df6 = df2.droplevel(0, axis=1).align(df1)
print (df5)
b Ice Water
c Cold Cold Hot
a Tuesday Monday_One Monday_One Monday_Two
0 10 4 1 1
1 9 5 2 2
2 8 6 3 3
3 7 7 4 4
4 6 8 5 5
df = df1.mul(df5)
print (df)
a Monday_One Monday_Two Tuesday
b Water Water Ice
c Cold Hot Hot Cold
0 16 1 1 100
1 25 4 4 81
2 36 9 21 64
3 49 12 16 49
4 64 25 25 36
Another idea with multiple
with set columns names, for original order is added DataFrame.reorder_levels
with DataFrame.reindex
:
df2 = df2.rename_axis(('a','b','c'), axis=1)
df1 = df1.rename_axis(('a','b','c'), axis=1)
df = (df1.mul(df2.droplevel(0, axis=1))
.reorder_levels(['a','b','c'], axis=1)
.reindex(df1.columns, axis=1))
print (df)
a Monday_One Monday_Two Tuesday
b Water Water Ice
c Cold Hot Hot Cold
0 16 1 1 100
1 25 4 4 81
2 36 9 21 64
3 49 12 16 49
4 64 25 25 36
Upvotes: 3