Reputation: 23
I am choosing the values in Pandas DataFrame.
I would like to choose the values in the columns 'One_T','Two_T','Three_T'(which means the total counts), based on the Ratios of the columns('One_R','Two_R','Three_R').
Comparing values is done by the columns('One_R','Two_R','Three_R') and choosing values will be done by columns ('One_T','Two_T','Three_T').
I would like to find the highest values among columns('One_R','Two_R','Three_R') and put values from columns 'One_T','Two_T','Three_T' in new column 'Highest'.
For example, the first row has the highest values in One_R than Two_R and Three_R. Then, the values in One_T will be filled the column named Highest.
The initial data frame is test below code and the desired result is the result in the below code.
test = pd.DataFrame([[150,30,140,20,120,19],[170,31,130,30,180,22],[230,45,100,50,140,40],
[140,28,80,10,60,10],[100,25,80,27,50,23]], index=['2019-01-01','2019-02-01','2019-03-01','2019-04-01','2019-05-01'],
columns=['One_T','One_R','Two_T','Two_R','Three_T','Three_R'])
One_T One_R Two_T Two_R Three_T Three_R
2019-01-01 150 30 140 20 120 19
2019-02-01 170 31 130 30 180 22
2019-03-01 230 45 100 50 140 40
2019-04-01 140 28 80 10 60 10
2019-05-01 100 25 80 27 50 23
result = pd.DataFrame([[150,30,140,20,120,19,150],[170,31,130,30,180,22,170],[230,45,100,50,140,40,100],
[140,28,80,10,60,10,140],[100,25,80,27,50,23,80]], index=['2019-01-01','2019-02-01','2019-03-01','2019-04-01','2019-05-01'],
columns=['One_T','One_R','Two_T','Two_R','Three_T','Three_R','Highest'])
One_T One_R Two_T Two_R Three_T Three_R Highest
2019-01-01 150 30 140 20 120 19 150
2019-02-01 170 31 130 30 180 22 170
2019-03-01 230 45 100 50 140 40 100
2019-04-01 140 28 80 10 60 10 140
2019-05-01 100 25 80 27 50 23 80
Is there any way to do this?
Thank you for time and considerations.
Upvotes: 2
Views: 43
Reputation: 75080
You can solve this using df.filter
to select columns with the _R
suffix, then idxmax
. Then replace _R
with _T
and use df.lookup
:
s = test.filter(like='_R').idxmax(1).str.replace('_R','_T')
test['Highest'] = test.lookup(s.index,s)
print(test)
One_T One_R Two_T Two_R Three_T Three_R Highest
2019-01-01 150 30 140 20 120 19 150
2019-02-01 170 31 130 30 180 22 170
2019-03-01 230 45 100 50 140 40 100
2019-04-01 140 28 80 10 60 10 140
2019-05-01 100 25 80 27 50 23 80
Upvotes: 2