Reputation: 3713
Given the following data frame:
import pandas as pd
df = pd.DataFrame({'foo': ['abc', '2. abc', '3. abc']})
df
foo
abc
2. abc
3. abc
I'd like to split on '. ' to produce this:
foo  bar
     abc
2    abc
3    abc
Thanks in advance!
Upvotes: 2
Views: 103
Reputation: 210922
You can do this with the .str.extract() method:
In [163]: df.foo.str.extract(r'(?P<foo>\d*)[\.\s]*(?P<bar>.*)', expand=True)
Out[163]:
  foo  bar
0      abc
1   2  abc
2   3  abc
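For readers less familiar with named groups, here is the same call as a standalone script with the pattern annotated piece by piece. It is a minimal sketch that reuses the question's sample frame; the variable names `pattern` and `result` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'foo': ['abc', '2. abc', '3. abc']})

# (?P<foo>\d*)  optional leading digits, captured as column "foo"
# [\.\s]*       the separator: any run of dots and whitespace, not captured
# (?P<bar>.*)   whatever remains, captured as column "bar"
pattern = r'(?P<foo>\d*)[\.\s]*(?P<bar>.*)'

# Each named group becomes a column named after the group
result = df['foo'].str.extract(pattern, expand=True)
print(result)
```

Because `\d*` may match nothing, a row without a numeric prefix gets an empty string in `foo` rather than a missing value.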
Upvotes: 1
Reputation: 3855
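If you have a folder where you can write a temporary file, you can save the frame as a CSV file and read it back with the new separator:
df.to_csv('yourfolder/yourfile.csv', index=False)
# A separator longer than one character is parsed as a regex, so escape the dot and use the Python engine
df = pd.read_csv('yourfolder/yourfile.csv', sep=r'\. ', engine='python')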
Upvotes: 1
Reputation: 863291
You can use str.split, but then you need to swap the values with numpy.where wherever mask is True. Finally, fill the missing values in column foo with '' using fillna:
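import numpy as np

# Multi-character patterns are treated as regular expressions, so escape the dot
df1 = df.foo.str.split(r'\. ', expand=True)
df1.columns = ['foo', 'bar']
print(df1)
   foo   bar
0  abc  None
1    2   abc
2    3   abc
mask = df1.bar.isnull()
print(mask)
0     True
1    False
2    False
Name: bar, dtype: bool
# Where bar is missing, the whole string landed in foo, so swap the two columns
df1['foo'], df1['bar'] = (np.where(mask, df1['bar'], df1['foo']),
                          np.where(mask, df1['foo'], df1['bar']))
# Assign the result back instead of calling fillna in place on a column
df1['foo'] = df1['foo'].fillna('')
print(df1)
  foo  bar
0      abc
1   2  abc
2   3  abc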
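If you need this in more than one place, the split-and-swap steps can be wrapped in a small helper. This is only a sketch: the function name `split_prefix` and its `sep` parameter are hypothetical, and it assumes you want at most one split per string.

```python
import re

import numpy as np
import pandas as pd


def split_prefix(series, sep='. '):
    """Split each string once on `sep`; strings without `sep` go entirely to 'bar'.

    Hypothetical helper for illustration, not part of pandas.
    """
    # re.escape keeps the separator literal, since multi-character
    # patterns are treated as regular expressions by str.split
    parts = series.str.split(re.escape(sep), n=1, expand=True)
    # Guarantee two columns even if no row contained the separator
    parts = parts.reindex(columns=[0, 1])
    parts.columns = ['foo', 'bar']

    # Rows where nothing was split off have a missing 'bar'
    mask = parts['bar'].isna()
    return pd.DataFrame({
        'foo': np.where(mask, '', parts['foo']),
        'bar': np.where(mask, parts['foo'], parts['bar']),
    }, index=series.index)


df = pd.DataFrame({'foo': ['abc', '2. abc', '3. abc']})
result = split_prefix(df['foo'])
print(result)
```

Filling `foo` with `''` directly inside `np.where` makes the separate fillna step unnecessary.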
Upvotes: 1