Reputation: 1063
I have a very large dataframe where one column has a set of names in it. Unfortunately, the same column also has some other data there. What I want to do is to find all the rows that contain the names, look for those in another column, get the max value from a third column. Then, I need to place that value in the same row as the name.
I'm currently using a for
loop, which does exactly what I want, but it is extremely slow. I've tried fooling around with lookup
but can't get it to work. Just bear in mind that the real dataframe has something like 20,000 rows!
import pandas as pd
import numpy as np
friends = ['Mickey', 'Minnie', 'Goofy', 'Donald', 'Daisy', 'Pluto']
df_data = {'Name':['Mickey', 'Ravens', 'Vikings',
'Minnie', 'Packers', 'Browns',
'Goofy', 'Cowboys', 'Steelers',
'Donald', '49ers', 'Bears',
'Daisy', 'Chiefs', 'Raiders',
'Pluto', 'Patriots', 'Colts'],
'Owner': [np.nan, 'Mickey', 'Mickey',
np.nan, 'Minnie', 'Minnie',
np.nan, 'Goofy', 'Goofy',
np.nan, 'Donald', 'Donald',
np.nan, 'Daisy', 'Daisy',
np.nan, 'Pluto', 'Pluto'],
'Wins': [np.nan, 14, 10,
np.nan, 13, 6,
np.nan, 8, 8,
np.nan, 13, 8,
np.nan, 12, 7,
np.nan, 12, 7]}
df = pd.DataFrame(data = df_data)
for f in friends:
df.loc[df['Name'] == f, 'Wins'] = df.loc[df['Owner'] == f, 'Wins'].max()
print(df.to_string())
Upvotes: 2
Views: 4199
Reputation: 78690
IIUC, you can use
df['Wins'].update(df['Name'].map(df.groupby('Owner')['Wins'].max()))
Upvotes: 2
Reputation: 71689
Use, DataFrame.groupby
on Owner
then use the agg
function max
to create a mapping series s
, Now using the series s
update the values in Wins
column in df
using Series.update
:
s = df.groupby('Owner')['Wins'].max()
df = df.set_index('Name')
df['Wins'].update(s)
df = df.reset_index()
# print(df)
Name Owner Wins
0 Mickey NaN 14.0
1 Ravens Mickey 14.0
2 Vikings Mickey 10.0
3 Minnie NaN 13.0
4 Packers Minnie 13.0
5 Browns Minnie 6.0
6 Goofy NaN 8.0
7 Cowboys Goofy 8.0
8 Steelers Goofy 8.0
9 Donald NaN 13.0
10 49ers Donald 13.0
11 Bears Donald 8.0
12 Daisy NaN 12.0
13 Chiefs Daisy 12.0
14 Raiders Daisy 7.0
15 Pluto NaN 12.0
16 Patriots Pluto 12.0
17 Colts Pluto 7.0
Upvotes: 1