Reputation: 133

Split string with names and middle initial? (python3)

I have a dataset, however it includes Position #, "Lastname, Firstname M ",,

I am able to split it, however i want the middle initial gone, and the white space gone too, this is what i have:

df = pd.read_excel('C:\\HR employees\\EE Listing  as of 5-30-17.xlsx')

df['Last Name'], df['First Name'] = df['NAME'].str.split(',', 1).str
del df['NAME']

df.to_excel('output.xlsx')

It splits the firstname and last name, but how can i get rid of the middle initial?

Upvotes: 0

Answers (2)

Aditya Barve

Reputation: 1481

df = pd.read_excel('C:\\HR employees\\EE Listing  as of 5-30-17.xlsx')

df['Last Name'], df['First Name'] = df['NAME'].str.split(',').str
df['First Name'] = df['First Name'].apply(lambda s: s[:-2] if s[-2] == ' ' else s)
del df['NAME']    

df.to_excel('output.xlsx')

The lambda function checks if the second last character in the first name is a space. This removes the initial (if present) while also handling two-part first names.

Test cases: (NAME ---> Last Name ---> First Name)

Smith, John R ---> Smith ---> John
JOHN, DOE ---> JOHN ---> DOE
Jackson, Jean Paul M ---> Jackson ---> Jean Paul
Gonzales, El Torado ---> Gonzales ---> El Torado

Upvotes: 2

syntaxError

Reputation: 866

What about a loop to copy the first name up until the space between the middle letter?

df['First Name'] = 'dave M'
name = ''
for ch in df['First Name']:
     if ch != ' ':
          name += ch
     else: break
df['First Name'] = name


# OR ALTERNATIVLY 
df['First Name'] = df['First Name'].split()[0]

Upvotes: 2

Split string with names and middle initial? (python3)

Answers (2)

Related Questions