Platalea Minor
Platalea Minor

Reputation: 877

How to split based on multiple delimiter pandas

I have a df column which contains

Phone number
12399422/930201021
5451354;546325642
789888744,656313214
123456654

I would like to separate it into two columns

Phone number1      Phone number2
12399422           930201021
5451354            546325642
789888744          656313214
123456654

I have tried to use apply,

    df['TELEPHONE1'] = df['TELEPHONE'].str.split(',').str.get(0)
    df['TELEPHONE2'] = df['TELEPHONE'].str.split(',').str.get(1)
    df['TELEPHONE1'] = df['TELEPHONE'].str.split(';').str.get(0)
    df['TELEPHONE2'] = df['TELEPHONE'].str.split(';').str.get(1)
    df['TELEPHONE1'] = df['TELEPHONE'].str.split('/').str.get(0)
    df['TELEPHONE2'] = df['TELEPHONE'].str.split('/').str.get(1)

but it can only split '/' Thank you for your help

Upvotes: 9

Views: 18242

Answers (3)

jsonao
jsonao

Reputation: 21

df.Phonenumber.str.split('[\/\|\;\|\,]',expand=True).add_prefix('Number_')

this should be the right answer

Upvotes: 2

cs95
cs95

Reputation: 402323

Just split on \D (non-digit):

(df['Phone number'].str.split(r'\D', expand=True)
    .rename(columns=lambda x: f'num{x+1}'))

        num1       num2
0   12399422  930201021
1    5451354  546325642
2  789888744  656313214
3  123456654       None

...Under the assumption that phone numbers only contain digits.


Alternatively, use extractall and match digits only:

df['Phone number'].str.extractall(r'(\d+)')[0].unstack()

match          0          1
0       12399422  930201021
1        5451354  546325642
2      789888744  656313214
3      123456654        NaN

Upvotes: 3

BENY
BENY

Reputation: 323226

Using str.split

df.Phonenumber.str.split('/|;|,',expand=True).add_prefix('Number_')
    Number_0   Number_1
0   12399422  930201021
1    5451354  546325642
2  789888744  656313214
3  123456654       None

Upvotes: 16

Related Questions