Reputation: 2158
I am trying to split a pandas column value without losing its delimiter. Here is the Stack Overflow answer that I am following. It works well when I pass a plain string, but it doesn't work when I want to split on '/m'. I tried different regexes, but none of them seem to work either. Any suggestions?
import pandas as pd
ls = [
{'ID': 'ABC',
'LongString': '/m/04abc3 1 1 1 1 /m/04ccc32 3 3 3 3'},
{'ID': 'CDE',
'LongString': '/m/04abc4 2 2 2 2 /m/04ccc12 4 4 4 4'}
]
df = pd.DataFrame(ls)
df['LongString'] = df['LongString'].str.split(r'(?<=/m)\s')  # Tried removing `/` and using just `m` for testing; that did not do the trick.
I am trying to get it to look like this. What am I doing wrong here?
pandas dataframe format:
ID | LongString
ABC | ['/m/04abc3 1 1 1 1', '/m/04ccc32 3 3 3 3']
CDE | ['/m/04abc4 2 2 2 2', '/m/04ccc12 4 4 4 4']
Upvotes: 1
Views: 38
Reputation: 654
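It looks as if you want to split on a whitespace character that is followed by /m. In regex terms, you want a lookahead rather than a lookbehind: your pattern (?<=/m)\s only matches whitespace that comes right after /m, and in your data /m is always followed by /, never by a space.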
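To see the difference outside pandas, here is a small sketch using only the standard `re` module on one of the strings from the question. The variable names (`s`, `lookbehind_parts`, `lookahead_parts`) are just for illustration.

```python
import re

# One of the sample strings from the question
s = '/m/04abc3 1 1 1 1 /m/04ccc32 3 3 3 3'

# Lookbehind: matches whitespace that comes right AFTER '/m'.
# In this string '/m' is always followed by '/', so nothing matches
# and the string comes back unsplit.
lookbehind_parts = re.split(r'(?<=/m)\s', s)

# Lookahead: matches whitespace that comes right BEFORE '/m'.
# The '/m' itself is not consumed, so it stays at the start of the next piece.
lookahead_parts = re.split(r'\s(?=/m)', s)

print(lookbehind_parts)
print(lookahead_parts)
```

The lookbehind version returns a one-element list containing the whole string, while the lookahead version returns the two pieces with `/m` kept at the start of each.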
Proposed solution:
df['LongString'] = df['LongString'].str.split(r'\s(?=/m)')  # raw string so that \s is not treated as a string escape
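For completeness, a self-contained sketch of the lookahead split applied to the sample data from the question. It assumes a reasonably recent pandas, where a multi-character pattern passed to `str.split` is treated as a regular expression by default; `split_pattern` is just an illustrative name.

```python
import pandas as pd

# Sample data copied from the question
ls = [
    {'ID': 'ABC',
     'LongString': '/m/04abc3 1 1 1 1 /m/04ccc32 3 3 3 3'},
    {'ID': 'CDE',
     'LongString': '/m/04abc4 2 2 2 2 /m/04ccc12 4 4 4 4'},
]
df = pd.DataFrame(ls)

# Split on whitespace that is immediately followed by '/m'.
# The lookahead does not consume '/m', so each piece keeps its prefix.
split_pattern = r'\s(?=/m)'
df['LongString'] = df['LongString'].str.split(split_pattern)

print(df['LongString'].tolist())
```

Each row of `LongString` then holds a list of the `/m/...` chunks, which matches the layout asked for in the question.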
Upvotes: 3