Henry
Henry

Reputation: 411

Filter duplicates and append a character to each item

I am working on the following Dataframe:

print (df)

    LN        FN
0   Smith     Jason
1   Smith     Pat
2   Smith     Liz
3   Kim       Jim
4   Hazel     Vickie
5   Sun       Sandra

I would like to filter the duplicated names on ['LN'] and put a first character of a name from ['FN']. In this example, I would like to add 'J', 'P', and 'L' to each 'Smith' with the space on ['LN'].

Desired output would be:

print (df)

    LN        FN
0   Smith J   Jason
1   Smith P   Pat
2   Smith L   Liz
3   Kim       Jim
4   Hazel     Vickie
5   Sun       Sandra

My attempt:

My code below achieved the desired output but there should be a cleaner and more pandas-like way of achieving this.

df1 = df.loc[df.duplicated('LN', False)]
df2 = pd.DataFrame(df1.LN + ' '+ df1.FN.str.get(0))
df3 = pd.concat([df1,df2], axis=1)
df3 = df3[[0, 'FN']]
df3.columns = ['LN', 'FN']
df.update(df3)

Thank you for your help on this!

Upvotes: 2

Views: 210

Answers (1)

MaxU - stand with Ukraine
MaxU - stand with Ukraine

Reputation: 210842

you can do it this way:

In [41]: df.loc[df.LN.duplicated(keep=False), 'LN'] += ' ' + df.FN.str[0]

In [42]: df
Out[42]:
        LN      FN
0  Smith J   Jason
1  Smith P     Pat
2  Smith L     Liz
3      Kim     Jim
4    Hazel  Vickie
5      Sun  Sandra

Upvotes: 4

Related Questions