Paulo Henrique Zen

Reputation: 679

How to replace values in specific rows (selected by a condition) with values from rows that share a feature in pandas?

I'm having trouble replacing specific values that satisfy one condition, where the replacement values must be chosen based on another condition.

Example of dataframe (df):

  Gender  Surname Ticket
0   masc  Family1    a12
1    fem  NoGroup    aa3
2    boy  Family1    125
3    fem  Family2    aa3
4    fem  Family4    525
5   masc  NoGroup    a52

The condition for substituting values in the df['Surname'] column is:

if ((df['Gender'] != 'masc') & (df['Surname'] == 'NoGroup'))

For each row that satisfies this condition, the code must search for a row with the same ticket and substitute that row's Surname value; otherwise it should keep the value that already exists ('NoGroup').

In this example, the Surname value in row 1 ('NoGroup') should be replaced by 'Family2', which comes from row 3 (same ticket, 'aa3').

I tried the following, but it did not work:

for i in zip((df['Gender']!='man') & df['Surname']=='noGroup'): df['Surname'][i] = df.loc[df['Ticket']==df['Surname'][i]]

Upvotes: 2

Views: 61

Answers (1)

jpp

Reputation: 164783

With Pandas you should aim for vectorised calculations rather than row-wise loops. Here's one approach. First convert selected values to None:

df.loc[df['Gender'].ne('masc') & df['Surname'].eq('NoGroup'), 'Surname'] = None

Then, using only the rows that still have a surname, create a series that maps each Ticket to a Surname:

s = df[df['Surname'].notnull()].drop_duplicates('Ticket').set_index('Ticket')['Surname']
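To make the lookup concrete, here is a minimal sketch that builds the mapping series from the question's sample data. It assumes the first step has already run, so row 1's Surname is entered directly as None. The name `lookup` is introduced only for illustration.

```python
import pandas as pd

# Sample data from the question, in the state reached after the first step:
# row 1 ('fem', 'NoGroup') has already had its Surname set to None.
df = pd.DataFrame({
    "Gender": ["masc", "fem", "boy", "fem", "fem", "masc"],
    "Surname": ["Family1", None, "Family1", "Family2", "Family4", "NoGroup"],
    "Ticket": ["a12", "aa3", "125", "aa3", "525", "a52"],
})

# Keep rows with a known surname, keep the first row per ticket,
# then index by Ticket so the series acts as a Ticket -> Surname lookup.
s = (
    df[df["Surname"].notnull()]
    .drop_duplicates("Ticket")
    .set_index("Ticket")["Surname"]
)

# Hypothetical helper name, used only to inspect the mapping as a dict.
lookup = s.to_dict()
print(lookup)
```

Because `drop_duplicates('Ticket')` keeps only the first row for each ticket, every ticket maps to exactly one surname. Note that row 5 keeps 'NoGroup' in the mapping, since it was not selected by the condition.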

Finally, fill the null values by mapping each row's Ticket through the calculated series:

df['Surname'] = df['Surname'].fillna(df['Ticket'].map(s))

Result:

  Gender  Surname Ticket
0   masc  Family1    a12
1    fem  Family2    aa3
2    boy  Family1    125
3    fem  Family2    aa3
4    fem  Family4    525
5   masc  NoGroup    a52
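For anyone who wants to reproduce this end to end, the three steps can be wrapped into one runnable sketch. The function name `fill_surname_from_ticket` and its parameters are hypothetical, not part of pandas. The last `fillna(placeholder)` is an addition that is not in the steps above: it assumes you want rows with no matching ticket to fall back to 'NoGroup' rather than stay null.

```python
import pandas as pd


def fill_surname_from_ticket(df, placeholder="NoGroup", excluded_gender="masc"):
    """Hypothetical helper: replace placeholder surnames using rows with the same ticket."""
    out = df.copy()  # leave the caller's DataFrame untouched

    # Step 1: blank out the surnames that should be replaced.
    mask = out["Gender"].ne(excluded_gender) & out["Surname"].eq(placeholder)
    out.loc[mask, "Surname"] = None

    # Step 2: Ticket -> Surname lookup from rows that still have a surname.
    s = (
        out[out["Surname"].notnull()]
        .drop_duplicates("Ticket")
        .set_index("Ticket")["Surname"]
    )

    # Step 3: fill the blanks by looking up each row's ticket.
    out["Surname"] = out["Surname"].fillna(out["Ticket"].map(s))

    # Addition (assumption): rows with no matching ticket get the placeholder back.
    out["Surname"] = out["Surname"].fillna(placeholder)
    return out


df = pd.DataFrame({
    "Gender": ["masc", "fem", "boy", "fem", "fem", "masc"],
    "Surname": ["Family1", "NoGroup", "Family1", "Family2", "Family4", "NoGroup"],
    "Ticket": ["a12", "aa3", "125", "aa3", "525", "a52"],
})

result = fill_surname_from_ticket(df)
print(result)
```

Working on a copy keeps the original frame available for comparison; drop the `copy()` if you prefer to modify `df` in place.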

Upvotes: 1
