Reputation: 4338
I'm having trouble figuring out how to split a DataFrame column on a character while retaining that character. Here's some example data:
import pandas as pd

df = pd.DataFrame(
    {"sexage": ['m45', 'f43']}
)
What I'd like is a separate column with the male/female letter and a separate column with the age.
When I do df['sexage'].str.split('m|f', expand=True), there's no value in the first column. But when I do df['sexage'].str.split('(m|f)', expand=True), I get an extra blank column that I don't want.
I know I can select them by position with df['sexage'].str[0] and df['sexage'].str[1:], but I was wondering if I could do this with regex instead.
Upvotes: 1
Views: 44
Reputation: 150735
Try str.extract, with a raw string so the backslashes in the pattern reach the regex engine unchanged:
df.sexage.str.extract(r'(\D+)(\d+)')
output:
   0   1
0  m  45
1  f  43
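If you want labelled columns rather than 0 and 1, named groups should work. This is only a sketch: the group names sex and age and the variable parts are my own choices, and I've narrowed \D+ to [mf] on the assumption that the prefix is always a single m or f.

```python
import pandas as pd

df = pd.DataFrame({"sexage": ["m45", "f43"]})

# Named groups become the column names of the extracted frame.
# The names "sex" and "age" are illustrative, not from the question.
parts = df["sexage"].str.extract(r"(?P<sex>[mf])(?P<age>\d+)")

# extract returns strings, so convert age explicitly if you need numbers.
parts["age"] = parts["age"].astype(int)

# Attach the new columns to the original frame by index.
df = df.join(parts)
print(df)
```

Rows that don't match the pattern come back as NaN from extract, in which case the astype(int) step would fail, so check for missing values first if your real data is less regular.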
Upvotes: 2