Nikhil Mangire

Reputation: 407

How to split a dataframe using pandas?

I have the following dataframe, DF, to process:

Name                City
Hat, Richards       Paris
Adams               New york
Tim, Mathews        Sanfrancisco
chris, Moya De      Las Vegas
kate, Moris         Atlanta
Grisham HA          Middleton
James, Tom, greval  Rome

My expected dataframe, DF, should look like this:

Name         Last_name           City
Hat          Richards            Paris
             Adams               New york
Tim          Mathews             Sanfrancisco
chris        Moya De             Las Vegas
kate         Moris               Atlanta
             Grisham HA          Middleton
James, Tom   greval              Rome

The split should happen on the last ','. If there is no ',', the entire word or phrase should go in the 'Last_name' column and the 'Name' column should remain empty.

Upvotes: 4

Views: 67

Answers (3)

jezrael

Reputation: 863741

Use str.rsplit with radd to prepend ', ' to every value, then str.lstrip to remove it again:

df[['first','last']] = df['Name'].radd(', ').str.rsplit(', ', n=1, expand=True)
df['first'] = df['first'].str.lstrip(', ')
print (df)
                 Name          City       first        last
0       Hat, Richards         Paris         Hat    Richards
1               Adams      New york                   Adams
2        Tim, Mathews  Sanfrancisco         Tim     Mathews
3      chris, Moya De     Las Vegas       chris     Moya De
4         kate, Moris       Atlanta        kate       Moris
5          Grisham HA     Middleton              Grisham HA
6  James, Tom, greval          Rome  James, Tom      greval
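
For anyone who wants to run this end to end, here is a self-contained sketch of the same idea. The sample frame is rebuilt by hand from the question, and the intermediate name `parts` is only for illustration; treat it as one possible way to write it, not the only one.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "Name": ["Hat, Richards", "Adams", "Tim, Mathews", "chris, Moya De",
             "kate, Moris", "Grisham HA", "James, Tom, greval"],
    "City": ["Paris", "New york", "Sanfrancisco", "Las Vegas",
             "Atlanta", "Middleton", "Rome"],
})

# Prepend ', ' so every value contains at least one separator,
# then split once, starting from the right.
parts = df["Name"].radd(", ").str.rsplit(", ", n=1, expand=True)

# Strip the prepended ',' and ' ' characters from the left piece.
df["first"] = parts[0].str.lstrip(", ")
df["last"] = parts[1]

print(df)
```

Because the separator is prepended before splitting, rows without a comma end up with an empty string in `first` rather than a missing value.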

Upvotes: 4

piRSquared

Reputation: 294546

Quick and Dirty

Use Series.str.rsplit with n=1 so that only the last ', ' is split, then str[::-1] to reverse the order of the pieces

df[['Last_name', 'Name']] = df.Name.str.rsplit(', ', n=1).str[::-1].apply(pd.Series)

df

    Name          City   Last_name
0    Hat         Paris    Richards
1    NaN      New york       Adams
2    Tim  Sanfrancisco     Mathews
3  chris     Las Vegas     Moya De
4   kate       Atlanta       Moris
5    NaN     Middleton  Grisham HA
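
One thing to watch with the sample row 'James, Tom, greval': it contains two separators, so splitting on every ', ' gives three pieces instead of two. A small sketch of the difference, assuming a hand-built Series with three of the names from the question:

```python
import pandas as pd

names = pd.Series(["Hat, Richards", "Adams", "James, Tom, greval"])

# Splitting on every ', ' yields three pieces for the last value.
all_parts = names.str.split(", ")

# Splitting once from the right keeps 'James, Tom' together.
last_split = names.str.rsplit(", ", n=1)

print(all_parts.str.len().tolist())
print(last_split.tolist())
```

Limiting the split to one cut from the right is what keeps the result at two columns when the pieces are expanded.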

Upvotes: 4

BENY

Reputation: 323396

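Use str.rsplit with n=1 to split only on the last ', ' (the default, n=-1, splits on every one; change it to what you need), then ffill along the columns and blank the first column wherever both columns are equal

newdf=df.Name.str.rsplit(', ',expand=True,n=1).ffill(axis=1)
newdf.loc[newdf[0]==newdf[1],0]=''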
newdf
Out[923]: 
       0          1
0    Hat   Richards
1             Adams
2    Tim    Mathews
3  chris     MoyaDe
4   kate      Moris
5         GrishamHA
df[['Name','LastName']]=newdf
df
Out[925]: 
    Name          City   LastName
0    Hat         Paris   Richards
1              Newyork      Adams
2    Tim  Sanfrancisco    Mathews
3  chris      LasVegas     MoyaDe
4   kate       Atlanta      Moris
5            Middleton  GrishamHA
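
To see what each step of the forward-fill approach contributes, here is a minimal sketch on three of the names from the question. It assumes `rsplit` so that the cut happens at the last ', ' as the question asks, and it passes `axis=1` as a keyword; the names `parts` and `filled` are only for illustration.

```python
import pandas as pd

names = pd.Series(["Hat, Richards", "Adams", "James, Tom, greval"])

# Values without a separator leave the second column missing.
parts = names.str.rsplit(", ", n=1, expand=True)

# Fill along the columns so the lone value is copied into column 1.
filled = parts.ffill(axis=1)

# Rows where both columns match had no separator: blank the first column.
filled.loc[filled[0] == filled[1], 0] = ""

print(filled)
```

Note that the equality check would also blank a genuine pair such as 'Smith, Smith', so this is only a reasonable shortcut when that cannot occur in the data.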

Upvotes: 4
