Reputation: 1
I have a users database. I want to separate them into two columns to have user1 and user2. The way I was solving this was to split the names into multiple columns then merge the names to have the two columns of users.
The issue I run into is some names are long and after the split. Those names take some spot on the data frame which makes it harder to merge properly.
Users |
---|
Maria Melinda Del Valle Justin Howard |
Devin Craig Jr. Michael Carter III |
Jeanne De Bordeaux Alhamdi |
After I split the user columns
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
---|---|---|---|---|---|---|---|---|
Maria | Melinda | Del | Valle | Justin | Howard | |||
Devin | Craig | Jr. | Michael | Carter | III | |||
Jeanne | De | Bordeaux | Alhamdi |
The expected result is the following
User1 | User2 |
---|---|
Maria Melinda Del valle | Justin Howard |
Devin Craig Jr. | Michael Carter III |
Jeanne De Bordeaux | Alhamdi |
Upvotes: 0
Views: 76
Reputation: 120499
You can use:
def f(sr):
m = sr.isna().cumsum().loc[lambda x: x < 2]
return sr.dropna().groupby(m).apply(' '.join)
out = df.apply(f, axis=1).rename(columns=lambda x: f'User{x+1}')
Output:
>>> out
User1 User2
0 Maria Melinda Del Valle Justin Howard
1 Devin Craig Jr. Michael Carter III
2 Jeanne De Bordeaux Alhamdi
As suggested by @Barmar, If you know where to put the blank columns in the first split, you should know how to create both columns.
Upvotes: 1