Dance Party

Reputation: 3713

Pandas Split on '. '

Given the following data frame:

import pandas as pd
df=pd.DataFrame({'foo':['abc','2. abc','3. abc']})
df

 foo
   abc
2. abc
3. abc

I'd like to split on '. ' to produce this:

foo   bar
      abc
2     abc
3     abc

Thanks in advance!

Upvotes: 2

Views: 103

Answers (3)

MaxU - stand with Ukraine

Reputation: 210922

You can do this with the .str.extract() method:

In [163]: df.foo.str.extract(r'(?P<foo>\d*)[\.\s]*(?P<bar>.*)', expand=True)
Out[163]:
  foo  bar
0      abc
1   2  abc
2   3  abc
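
As a variation on the pattern above, the numeric prefix can be put in an optional group so that only a literal '. ' is consumed. This is just a sketch: the stricter pattern and the name `out` are my own assumptions, not part of the answer above, and it assumes every prefix is made of digits.

```python
import pandas as pd

df = pd.DataFrame({'foo': ['abc', '2. abc', '3. abc']})

# Optional non-capturing group: digits followed by a literal '. '.
# Rows without that prefix leave the 'foo' group unmatched (NaN).
pattern = r'^(?:(?P<foo>\d+)\. )?(?P<bar>.*)$'

# Replace the unmatched prefixes with empty strings.
out = df['foo'].str.extract(pattern, expand=True).fillna('')
print(out)
```

The named groups become the column names, so no renaming step is needed.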

Upvotes: 1

ysearka

Reputation: 3855

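If you have a folder where you can write a temporary file, you can save the frame as a CSV file and read it back with the new separator:

df.to_csv('yourfolder/yourfile.csv', index=False)

# A separator longer than one character is treated as a regular expression,
# so the dot is escaped; regex separators need the Python parser engine.
df = pd.read_csv('yourfolder/yourfile.csv', sep=r'\. ', engine='python')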

Upvotes: 1

jezrael

Reputation: 863291

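You can use str.split, but rows without the separator end up with their text in the first column, so you need to swap the two columns for those rows with numpy.where. Finally, fill the missing values in column foo with '':

import numpy as np

# The pattern is treated as a regular expression, so the dot is escaped.
df1 = df.foo.str.split(r'\. ', expand=True)
df1.columns = ['foo','bar']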

print (df1)
   foo   bar
0  abc  None
1    2   abc
2    3   abc

mask = df1.bar.isnull()
print (mask)
0     True
1    False
2    False
Name: bar, dtype: bool

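# Swap the columns where bar is missing; the parentheses allow the line break.
df1['foo'], df1['bar'] = (np.where(mask, df1['bar'], df1['foo']),
                          np.where(mask, df1['foo'], df1['bar']))

# Assign the result back instead of calling fillna in place on df1.foo.
df1['foo'] = df1['foo'].fillna('')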

print (df1)
  foo  bar
0      abc
1   2  abc
2   3  abc
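
If the swap feels like too much machinery, one possible alternative (a different technique from the split-and-swap above, offered only as a sketch) is .str.rpartition(). It splits at the last occurrence of a literal separator and leaves the text in the last column when the separator is absent. The names `parts` and `out` are mine.

```python
import pandas as pd

df = pd.DataFrame({'foo': ['abc', '2. abc', '3. abc']})

# rpartition splits at the last literal '. ' and returns three columns:
# text before the separator, the separator itself, and text after it.
parts = df['foo'].str.rpartition('. ')

# Keep the first and last columns and drop the separator column.
out = parts[[0, 2]].copy()
out.columns = ['foo', 'bar']
print(out)
```

Because the separator is taken literally here, the dot needs no escaping, and rows without a prefix get an empty string in foo instead of a missing value.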

Upvotes: 1
