Giordan Pretelin
Giordan Pretelin

Reputation: 363

Split Pandas column, then join elements with matching element of another split column

I have a df with three columns with the following structure:

Name                    |   First Last Name    |    Second Last Name
---------------------------------------------------------------------
Name1/ Name2 / Name3    |   FLN1 / FLN2 / FLN3 |    SLN1 / / SLN3
---------------------------------------------------------------------
Name1                   |   FLN1               |    SLN1
---------------------------------------------------------------------
Name1 / Name2           |   FLN1 / FLN2        |     / SLN2

And wish to have something as follows:

|Full names                                         |
-----------------------------------------------------
|Name 1 FLN1 SLN1, Name 2 FLN 2, Name 3 FLN 3 SLN3  |
-----------------------------------------------------
|Name1 FLN1 SLN1                                    |
-----------------------------------------------------
|Name 1 FLN1, Name 2 FLN 2 SLN2                     |

Basically I'm trying to split every column by "/" and then join every element of the resulting array with the appropriate element from the next two columns' arrays.

Thanks a lot in advance

Upvotes: 0

Views: 213

Answers (1)

FChm
FChm

Reputation: 2580

Making some assumptions about the format of your data...

I would use the pandas built in str processing methods:

df = pd.DataFrame({'first':['A / B / C', 'F / G'], 'second':['D / / E', 'H / I']})

full_name_df = df['first'].str.split('/', expand=True) + df['second'].str.split('/', expand=True)

where full_name_df looks like:

     0     1     2
0   A D    B    C E
1   F H   G I   NaN

As you can see you then get a DataFrame with n columns (where n is the maximum number of names in a given cell) and the same number of rows as your original DataFrame. I also think in some situations having this additional 'full_name' DataFrame is an advantage, although you could always add it as a column of your original DataFrame.

Upvotes: 1

Related Questions