Reputation: 679
I'm having trouble replacing values in rows that satisfy one condition, where the replacement value itself depends on another condition.
Gender Surname Ticket
0 masc Family1 a12
1 fem NoGroup aa3
2 boy Family1 125
3 fem Family2 aa3
4 fem Family4 525
5 masc NoGroup a52
The condition for substituting values in the df['Surname'] column is:
if ((df['Gender'] != 'masc') & (df['Surname'] == 'NoGroup'))
For each row that meets it, the code must search for another row with the same Ticket and substitute that row's Surname value; if there is none, it should keep the existing value ('NoGroup').
In this example, the Surname value in row 1 ('NoGroup') should be replaced by 'Family2', taken from row 3, which has the same Ticket ('aa3').
I tried it this way, but it did not work:
for i in zip((df['Gender']!='man') & df['Surname']=='noGroup'):
df['Surname'][i] = df.loc[df['Ticket']==df['Surname'][i]]
Upvotes: 2
Views: 61
Reputation: 164783
With Pandas you should aim for vectorised calculations rather than row-wise loops. Here's one approach. First, convert the selected values to None:
df.loc[df['Gender'].ne('masc') & df['Surname'].eq('NoGroup'), 'Surname'] = None
Then create a series mapping from Ticket to Surname, using only the rows that still have a non-null Surname and keeping the first row for each Ticket:
s = df[df['Surname'].notnull()].drop_duplicates('Ticket').set_index('Ticket')['Surname']
Finally, map null values with the calculated series:
df['Surname'] = df['Surname'].fillna(df['Ticket'].map(s))
Result:
Gender Surname Ticket
0 masc Family1 a12
1 fem Family2 aa3
2 boy Family1 125
3 fem Family2 aa3
4 fem Family4 525
5 masc NoGroup a52
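For reference, the three steps can be put together as one self-contained sketch that rebuilds the example frame from the question. The function name fill_surnames is hypothetical, and the last fillna('NoGroup') is an extra step not in the snippets above: it assumes that rows with no matching Ticket should fall back to 'NoGroup', as the question asks, instead of staying null.

```python
import pandas as pd


def fill_surnames(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper wrapping the three steps described above."""
    out = df.copy()

    # Step 1: blank out 'NoGroup' surnames for rows whose Gender is not 'masc'.
    mask = out["Gender"].ne("masc") & out["Surname"].eq("NoGroup")
    out.loc[mask, "Surname"] = None

    # Step 2: build a Ticket -> Surname lookup from rows that still have a surname.
    # drop_duplicates keeps the first row seen for each Ticket.
    lookup = (
        out[out["Surname"].notna()]
        .drop_duplicates("Ticket")
        .set_index("Ticket")["Surname"]
    )

    # Step 3: fill the blanked surnames from the lookup, matching on Ticket.
    out["Surname"] = out["Surname"].fillna(out["Ticket"].map(lookup))

    # Assumption (not in the original answer): rows with no matching Ticket
    # go back to 'NoGroup' rather than staying null.
    out["Surname"] = out["Surname"].fillna("NoGroup")
    return out


# Example data copied from the question.
df = pd.DataFrame(
    {
        "Gender": ["masc", "fem", "boy", "fem", "fem", "masc"],
        "Surname": ["Family1", "NoGroup", "Family1", "Family2", "Family4", "NoGroup"],
        "Ticket": ["a12", "aa3", "125", "aa3", "525", "a52"],
    }
)

result = fill_surnames(df)
print(result)
```

Row 5 keeps 'NoGroup' because its Gender is 'masc', so the mask in step 1 never selects it.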
Upvotes: 1