Reputation: 11
It's been a long night searching for a solution, I appreciate your help.
Having the following df
proposal1_amount | proposal2_amount | proposal3_amount | accepted_proposal |
---|---|---|---|
1000 | 2000 | 3000 | 3 |
5000 | 5200 | 4000 | 2 |
3000 | 2400 | 1120 | 1 |
I need to build a new column with the amount coming from the accepted corresponding column, it would be like this:
proposal1_amount | proposal2_amount | proposal3_amount | accepted_proposal | accepted_amount |
---|---|---|---|---|
1000 | 2000 | 3000 | 3 | 3000 |
5000 | 5200 | 4000 | 2 | 5200 |
1450 | 2400 | 1120 | 1 | 1450 |
I've found some examples which work fine when the new column has a fixed value, but in this case the value comes from another column on the same df.
thanks, vv
Upvotes: 1
Views: 59
Reputation: 4263
proposal1_amount=[1000,5000,1450]
proposal2_amount=[2000,5200,2400]
proposal3_amount=[3000,4000,1120]
accepted_proposal=[3,2,1]
df=pd.DataFrame({'proposal1_amount': proposal1_amount,'proposal2_amount': proposal2_amount,'proposal3_amount':proposal3_amount,'accepted_proposal':accepted_proposal})
df['accepted_proposal']=df['accepted_proposal'].astype(int)
df=df.assign(accepted_amount=df.apply(lambda row: row.iloc[row['accepted_proposal']-1], axis=1))
print(df)
output:
proposal1_amount proposal2_amount proposal3_amount accepted_proposal
0 1000 2000 3000 3
1 5000 5200 4000 2
2 1450 2400 1120 1
accepted_amount
0 3000
1 5200
2 1450
Upvotes: 0
Reputation: 774
Quickest solution I could think of:
df['accepted_amount'] = df.apply(lambda row: row.iloc[row['accepted_proposal']-1],axis=1)
Edit: Because I feel un-easy about the solution being contingent upon the ordering of the columns, here's a slightly wordier yet more dynamic solution:
df['accepted_amount']=df.apply(lambda row: row[['proposal1_amount','proposal2_amount','proposal3_amount']].iloc[row['accepted_proposal']-1],axis=1)
Upvotes: 3
Reputation: 13537
You can use numpy.choose
to do this pretty easily.
print(df)
proposal1_amount proposal2_amount proposal3_amount accepted_proposal
0 1000 2000 3000 3
1 5000 5200 4000 2
2 3000 2400 1120 1
# create 2d array of our choices (which corresponds to our amounts)
choices = df.filter(regex="proposal\d_amount").to_numpy()
# subtract 1 from "accepted_proposal" so they line up with indices in choices array
# (we want these 0-indexed, not 1-indexed)
a = df["accepted_proposal"] - 1
# np.choose does all the heavy lifting, assign output to new column
df["accepted_amount"] = np.choose(a, choices)
print(df)
proposal1_amount proposal2_amount proposal3_amount accepted_proposal accepted_amount
0 1000 2000 3000 3 3000
1 5000 5200 4000 2 5200
2 3000 2400 1120 1 3000
np.choose
will functionally iterate over each row of choices (e.g. iterate over each "proposalN_amount") and then take the amount that matches the index from accepted_proposal - 1
. See the docs for np.choose
Upvotes: 0