myusrn

Reputation: 1110

Changing column data from "lastname, firstname" to "firstname lastname" in a Python pandas DataFrame

I have a Python pandas DataFrame, supplied by Power BI's data source transformation support for running Python scripts, in which one column contains names as "lastname, firstname". I need it to contain "firstname lastname" instead.

I've tried the following split, reversed, join approach. It works on a standalone string, but raises AttributeError: 'Series' object has no attribute 'split' when I apply it to a column of a pandas DataFrame.

name = 'LastName, FirstName'
' '.join(reversed(name.split(', ')))
# output = 'FirstName LastName'

import pandas as pd
df = pd.DataFrame({'full_name': ['doe, john', 'smith, kate', 'jones, susan', 'edwards, jack' ],
                   'num_legs': [2, 4, 8, 0],
                   'num_wings': [2, 0, 0, 0],
                   'num_specimen_seen': [10, 2, 1, 8]},
                   index=['falcon', 'dog', 'spider', 'fish'])
df
df['full_name'] = ' '.join(reversed(df['full_name'].split(', ')))
# output = AttributeError: 'Series' object has no attribute 'split'

Searching SO, I see hits for doing something of this nature in an Excel column and in an R list, but I have not yet found anything for a column in a Python pandas DataFrame.

Upvotes: 2

Views: 1318

Answers (3)

sammywemmy

Reputation: 28709

A combination of pandas' string methods could help here. For speed, I would suggest running a list comprehension in plain Python instead; the string methods in pandas are provided primarily for convenience and simplicity.

# Split on ", ", reverse each resulting list, and join with a space
df['full_name'] = df.full_name.str.split(", ").str[::-1].str.join(" ")


          full_name     num_legs    num_wings   num_specimen_seen
falcon     john doe         2           2             10
dog        kate smith       4           0             2
spider     susan jones      8           0             1
fish       jack edwards     0           0             8

Upvotes: 3

ajmartin

Reputation: 2409

The error occurs because df['full_name'] is a pandas Series (type(df['full_name']) is <class 'pandas.core.series.Series'>), which has no split method of its own. Convert it to a list and then operate on each element:

import pandas as pd
df = pd.DataFrame({'full_name': ['doe, john', 'smith, kate', 'jones, susan', 'edwards, jack' ],
        'num_legs': [2, 4, 8, 0],
        'num_wings': [2, 0, 0, 0],
        'num_specimen_seen': [10, 2, 1, 8]},
        index=['falcon', 'dog', 'spider', 'fish'])

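# In Python 3, map() returns a lazy iterator, so wrap it in list() before printing
print(list(map(lambda x: ' '.join(x.split(', ')[::-1]), df['full_name'].tolist())))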
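
To illustrate the point about types, here is a small check (a sketch with made-up variable names, not code from the question) showing that the Series object lacks split, while each element is an ordinary str and the .str accessor applies string methods element-wise:

```python
import pandas as pd

names = pd.Series(["doe, john", "smith, kate"])

# The Series object itself has no split method, hence the AttributeError.
series_has_split = hasattr(names, "split")

# Each element is a plain Python str, so split works on it directly.
first_parts = names.iloc[0].split(", ")

# The .str accessor applies split to every element and returns a Series of lists.
split_lists = names.str.split(", ").tolist()

print(series_has_split, first_parts, split_lists)
```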

Upvotes: 0

BENY

Reputation: 323326

In your case, we can split with the .str accessor and then use map to join each resulting list. Note that the [::-1] slice reverses the order of the list.

df.full_name=df.full_name.str.split(', ').map(lambda x : ' '.join(x[::-1]))
df.full_name
falcon        john doe
dog         kate smith
spider     susan jones
fish      jack edwards
Name: full_name, dtype: object
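
If the real Power BI column might contain missing values or names without a comma (an assumption; the question's sample data has neither), the same map idea could be wrapped in a small helper. The name flip_name is hypothetical and only for illustration:

```python
import pandas as pd

def flip_name(value):
    """Hypothetical helper: turn 'last, first' into 'first last'."""
    # Leave missing values and other non-strings untouched.
    if not isinstance(value, str):
        return value
    # partition splits on the first comma only.
    last, sep, first = value.partition(",")
    if not sep:
        # No comma found: return the value unchanged.
        return value
    return f"{first.strip()} {last.strip()}"

names = pd.Series(["doe, john", "smith,kate", "cher", None])
flipped = names.map(flip_name)
print(flipped.tolist())
```

Stripping whitespace after the split means both "doe, john" and "smith,kate" are handled, at the cost of being a little slower than the one-liner.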

Upvotes: 3
