user42361

Reputation: 431

Split a column into multiple lists and keep the delimiter

I have a dataframe in which I need to split a column on the character "Y" while keeping that delimiter. For example,

    import pandas as pd

    d1 = pd.DataFrame({'user': [1,2,3],'action': ['YNY','NN','NYYN']})

The output dataframe should look like this,

    d2 = pd.DataFrame([{'action': 'Y, NY', 'user': 1},
                       {'action': 'NN', 'user': 2},
                       {'action': 'NY, Y, N', 'user': 3}])

    in[1]: d1
    out[1]: action  user
            YNY         1
            NN          2
            NYYN        3

    in[2]: d2
    out[2]:  action user
            Y, NY       1
            NN          2
            NY, Y, N    3

I have looked at a few related questions, such as "Python split() without removing the delimiter" and "Python splitting on regex without removing delimiters", but they are not exactly what I am looking for here.

Upvotes: 0

Views: 50

Answers (2)

Vivek Kalyanarangan

Reputation: 9081

Use:

    d1['action'].str.split('Y').str.join('Y,').str.rstrip(',')

Output:

    0      Y,NY
    1        NN
    2    NY,Y,N
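
To see why the trailing `rstrip(',')` is needed, the same three steps can be traced on plain Python strings. This is only a minimal sketch: the helper name `split_after_y` is made up for illustration and is not part of pandas.

```python
def split_after_y(s):
    """Mirror the split / join / rstrip chain on a single string."""
    parts = s.split('Y')        # 'YNY' -> ['', 'N', '']
    joined = 'Y,'.join(parts)   # put 'Y,' back where each 'Y' was removed
    return joined.rstrip(',')   # drop the comma left over when s ends with 'Y'

results = [split_after_y(s) for s in ['YNY', 'NN', 'NYYN']]
print(results)
```

Without the final `rstrip`, a value that ends in "Y" (such as 'YNY') would keep a dangling comma.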

Upvotes: 1

BENY

Reputation: 323226

Sounds like you need:

    d1.action.str.split('([^Y]*Y)').map(lambda x: [z for z in x if z != ''])
    Out[234]: 
    0       [Y, NY]
    1          [NN]
    2    [NY, Y, N]
    Name: action, dtype: object
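
A possible variant, which is my own assumption rather than part of the answer above, is to match the pieces directly with `re.findall` instead of splitting on a capturing group and then filtering out the empty strings. The names `PIECE` and `pieces` are hypothetical and used only for this sketch.

```python
import re

# A piece is either "zero or more non-Y characters followed by one Y"
# or a trailing run of non-Y characters that has no Y after it.
PIECE = re.compile(r'[^Y]*Y|[^Y]+')

def pieces(s):
    """Return the chunks of s, each ending at a 'Y' where one exists."""
    return PIECE.findall(s)

lists = [pieces(s) for s in ['YNY', 'NN', 'NYYN']]
joined = [','.join(p) for p in lists]   # comma-separated form, if needed
print(lists)
print(joined)
```

On the dataframe this could be applied with `d1['action'].map(pieces)`, and `Series.str.findall` should accept the same pattern.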

Upvotes: 1
