Reputation: 510
I have a problem that I would normally solve with for loops, and I am looking for a vectorized alternative.
I have two pandas DataFrames:

A =
         ind_1  ind_2  ind_3
prod_id
a            1      0      0
a            0      1      0
b            0      1      0
c            0      0      1
a            0      0      1

B =
         a    b    c
ind_1  0.1  0.2  0.3
ind_2  0.4  0.5  0.6
ind_3  0.7  0.8  0.9
I am looking for a way to solve the following problem with pandas:
For each row of A, I want to look up the entry of B whose row label is the name of the column holding the 1 and whose column label is the row's prod_id, and store it in a new column y of A, so the result looks like this:
A =
         ind_1  ind_2  ind_3    y
prod_id
a            1      0      0  0.1
a            0      1      0  0.4
b            0      1      0  0.5
c            0      0      1  0.9
a            0      0      1  0.7
Is there a way to solve this problem without a for loop?
Thank you in advance!
Upvotes: 0
Views: 63
Reputation: 862641
Use DataFrame.stack to get a MultiIndex Series from both DataFrames, then keep only the 1 values by filtering with a callable, filter the values of b with Index.isin, remove the first level of the MultiIndex, and finally add the new column. It is aligned by the index values of A:
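# (ind, prod_id) pairs where A holds a 1
a = A.T.stack().loc[lambda x: x == 1]
# all values of B, indexed by (ind, column name)
b = B.stack()
# keep only the matching pairs, then drop the ind level so prod_id remains
b = b[b.index.isin(a.index)].reset_index(level=0, drop=True)
A['y'] = b
print(A)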
         ind_1  ind_2  ind_3    y
prod_id
a            1      0      0  0.1
b            0      1      0  0.5
c            0      0      1  0.9
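The sample output above has one row per prod_id. The question's A repeats the label a, and label-based alignment in A['y'] = b may not work with repeated index labels, so a positional lookup could be an option there. This is only a minimal sketch, assuming every row of A contains exactly one 1 and every prod_id is a column of B. The frames are rebuilt from the question's data, and the names ind_cols, row_pos and col_pos are illustrative, not part of the original answer:

```python
import pandas as pd

# Data rebuilt from the question (prod_id "a" appears three times).
A = pd.DataFrame(
    {"ind_1": [1, 0, 0, 0, 0],
     "ind_2": [0, 1, 1, 0, 0],
     "ind_3": [0, 0, 0, 1, 1]},
    index=pd.Index(["a", "a", "b", "c", "a"], name="prod_id"),
)
B = pd.DataFrame(
    [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]],
    index=["ind_1", "ind_2", "ind_3"],
    columns=["a", "b", "c"],
)

ind_cols = ["ind_1", "ind_2", "ind_3"]  # assumed indicator columns

# Row position in B: label of the column that holds the 1 in each row of A.
row_pos = B.index.get_indexer(A[ind_cols].idxmax(axis=1))
# Column position in B: the prod_id of each row of A.
col_pos = B.columns.get_indexer(A.index)

# Positional (NumPy) indexing avoids aligning on the repeated labels.
A["y"] = B.to_numpy()[row_pos, col_pos]
print(A)
```

Because the values are assigned as a plain array, the order of the rows in A is what decides which value goes where, not the index labels.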
Or use DataFrame.join with DataFrame.query for filtering, but the processing is a bit more complicated:
a = A.stack()
b = B.stack()
s = (a.to_frame('a')
      .rename_axis((None, None))
      .join(b.swaplevel(1, 0)
             .rename('b'))
      .query("a == 1")
      .reset_index(level=1, drop=True))
A['y'] = s['b']
print(A)
         ind_1  ind_2  ind_3    y
prod_id
a            1      0      0  0.1
b            0      1      0  0.5
c            0      0      1  0.9
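The join pairs each cell of A with the matching cell of B and keeps the pairs where A is 1. The same pairing can also be sketched as an elementwise product followed by a row sum, which is not the method above but a different technique that tolerates repeated prod_id labels. It assumes A holds only 0/1 indicators with a single 1 per row. The data is rebuilt from the question, and the names ind_cols and weights are illustrative:

```python
import pandas as pd

# Data rebuilt from the question (prod_id "a" appears three times).
A = pd.DataFrame(
    {"ind_1": [1, 0, 0, 0, 0],
     "ind_2": [0, 1, 1, 0, 0],
     "ind_3": [0, 0, 0, 1, 1]},
    index=pd.Index(["a", "a", "b", "c", "a"], name="prod_id"),
)
B = pd.DataFrame(
    [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]],
    index=["ind_1", "ind_2", "ind_3"],
    columns=["a", "b", "c"],
)

ind_cols = list(B.index)  # indicator columns, in B's row order

# One row of B values per row of A: transpose B so prod_id is the index,
# then repeat rows to follow A's (possibly repeated) labels.
weights = B.T.reindex(A.index)

# The 0/1 mask zeroes every value except the one under the 1.
A["y"] = (A[ind_cols].to_numpy() * weights[ind_cols].to_numpy()).sum(axis=1)
print(A)
```

If a row of A held more than one 1, this would return the sum of the matching B values rather than a single lookup, so the one-hot assumption matters here.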
Upvotes: 1