Reputation: 1352
One of the columns holds strings. I want to split each string, but it does not have a unique character to use as a splitter. Below is the sample data frame:
df = pd.DataFrame({'Name':['John','David'],'Occupation':['CEO','Dep Dir'],'Contact':['HP No-Mobile Ph 123:456','Off-Mobile Ph 152:256']})
What I want to do is split the Contact column.
My desired output is as follows:
I used the following code to split at '-'.
df[['Contact1','Contact2']] = df.Contact.str.split('[-]',expand=True)
But the output is not in the format I want. Can anyone help me with this? It is a specific problem for which I could not find a solution. Thanks,
Zep
Upvotes: 0
Views: 49
Reputation: 13255
First slice off the unwanted data and then use split (assuming the length of the data after Ph is constant):
df[['Contact1','Contact2']] = df.Contact.str[:-8].str.split('[-]',expand=True)
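As a self-contained sketch of the slicing approach, using the sample data from the question: it assumes the trailing part is always exactly 8 characters (a space plus something like 123:456), which holds for the sample but may not hold for real data.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'David'],
    'Occupation': ['CEO', 'Dep Dir'],
    'Contact': ['HP No-Mobile Ph 123:456', 'Off-Mobile Ph 152:256'],
})

# Drop the last 8 characters (e.g. ' 123:456'); this assumes a fixed-length tail.
trimmed = df['Contact'].str[:-8]

# Split what remains on the hyphen into two new columns.
df[['Contact1', 'Contact2']] = trimmed.str.split('-', expand=True)
print(df[['Contact1', 'Contact2']])
```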
If the length of the data after Ph is not constant, use extract on letters and spaces:
df[['Contact1','Contact2']] = df.Contact.str.split('[-]',expand=True)
df['Contact2'] = df.Contact2.str.extract('([a-zA-Z ]+)')[0].str.rstrip()
df = pd.DataFrame({'Name':['John','David'],
                   'Occupation':['CEO','Dep Dir'],
                   'Contact':['HP No-Mobile Ph 123:456','Off-Mobile Ph']})
print(df)
    Name  Occupation                  Contact
0   John         CEO  HP No-Mobile Ph 123:456
1  David     Dep Dir            Off-Mobile Ph
df[['Contact1','Contact2']] = df.Contact.str.split('[-]',expand=True)
print(df)
    Name  Occupation                  Contact  Contact1           Contact2
0   John         CEO  HP No-Mobile Ph 123:456     HP No  Mobile Ph 123:456
1  David     Dep Dir            Off-Mobile Ph       Off          Mobile Ph
df['Contact2'] = df.Contact2.str.extract('([a-zA-Z ]+)')[0].str.rstrip()
print(df)
    Name  Occupation                  Contact  Contact1   Contact2
0   John         CEO  HP No-Mobile Ph 123:456     HP No  Mobile Ph
1  David     Dep Dir            Off-Mobile Ph       Off  Mobile Ph
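The two steps above can also be folded into one str.extract call with named groups. This is a different technique from the one in the answer, sketched under the assumption that the optional tail consists only of digits and colons; the pattern itself is illustrative, not taken from the original post.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'David'],
    'Occupation': ['CEO', 'Dep Dir'],
    'Contact': ['HP No-Mobile Ph 123:456', 'Off-Mobile Ph'],
})

# Assumed pattern (hypothetical, for illustration):
#   Contact1 = everything before the first hyphen
#   Contact2 = the rest, minus an optional trailing run of digits and colons
pattern = r'^(?P<Contact1>[^-]+)-(?P<Contact2>.*?)(?:\s+[\d:]+)?$'

# Named groups become the column names of the extracted frame.
df = df.join(df['Contact'].str.extract(pattern))
print(df[['Contact1', 'Contact2']])
```

Rows that do not match the pattern at all (for example, a Contact value with no hyphen) come back as NaN in both new columns.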
Upvotes: 1
Reputation: 863146
I believe you need split by - for 2 columns and then rsplit by the last whitespace:
df[['Contact1','Contact2']] = df.Contact.str.split('-',expand=True)
df['Contact2'] = df['Contact2'].str.rsplit(n=1).str[0]
print (df)
    Name  Occupation                  Contact  Contact1   Contact2
0   John         CEO  HP No-Mobile Ph 123:456     HP No  Mobile Ph
1  David     Dep Dir    Off-Mobile Ph 152:256       Off  Mobile Ph
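A runnable version of the same idea, as a minimal sketch on the question's sample data. It assumes every row ends with a whitespace-separated token to drop; a row with no trailing number would lose its last word instead. The n=1 on the first split is an added safeguard, not part of the answer, so that extra hyphens stay in the second part.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'David'],
    'Occupation': ['CEO', 'Dep Dir'],
    'Contact': ['HP No-Mobile Ph 123:456', 'Off-Mobile Ph 152:256'],
})

# Split once on the first hyphen: left part and right part.
parts = df['Contact'].str.split('-', n=1, expand=True)
df['Contact1'] = parts[0]

# Split the right part once from the right on whitespace and keep
# everything before the last token (the number).
df['Contact2'] = parts[1].str.rsplit(n=1).str[0]
print(df[['Contact1', 'Contact2']])
```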
Upvotes: 1
Reputation: 4607
df[['Contact1','Contact2']] = df['Contact'].str.split('-',expand=True)  # note: '-' or ' ' evaluates to just '-'
df.Contact2 = df.Contact2.str.split(' ').str[:-1].apply(' '.join)
Out:
                   Contact   Name  Occupation  Contact1   Contact2
0  HP No-Mobile Ph 123:456   John         CEO     HP No  Mobile Ph
1    Off-Mobile Ph 152:256  David     Dep Dir       Off  Mobile Ph
Upvotes: 1