Grendel

Reputation: 783

Extract part of a column using regex or split in python

Hello, I have a DataFrame such as:

COL1   COL2
G1     QANH010008.1:18255-18820(-):Hab_ob
G1     QANH010002:7-10(-):Hab_ob

and I would like to create two new columns, COL3 and COL4, holding the number before the first - and the number after the first -, respectively.

Here the output should be

COL1   COL2                                COL3   COL4
G1     QANH010008.1:18255-18820(-):Hab_ob  18255  18820
G1     QANH010002:7-10(-):Hab_ob           7      10

Upvotes: 0

Views: 25

Answers (1)

mechanical_meat

Reputation: 169304

You can use named capturing groups for this, then join the result to the original DataFrame. The group names become the names of the new columns. This answer incorporates a couple of suggestions from @MarkWang.

df.join(df['COL2'].str.extract(r'(?P<COL3>\d+)-(?P<COL4>\d+)'))

Output:

Out[206]: 
  COL1                                COL2   COL3   COL4
0   G1  QANH010008.1:18255-18820(-):Hab_ob  18255  18820
1   G1           QANH010002:7-10(-):Hab_ob      7     10
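
To see why the pattern picks up the right pair of numbers, here is a minimal sketch that applies the same regex to the two sample strings with the standard `re` module. The helper name `extract_range` is made up for illustration and is not part of pandas. The sketch relies on the first "digits, hyphen, digits" run in each string being the range of interest, which holds for the sample data but is an assumption for other inputs.

```python
import re

# Same pattern as in the answer: a run of digits, a hyphen, another run of digits.
PATTERN = re.compile(r'(?P<COL3>\d+)-(?P<COL4>\d+)')

def extract_range(value):
    """Hypothetical helper: return (start, end) as integers, or None if nothing matches."""
    match = PATTERN.search(value)  # search() returns the first match, scanning left to right
    if match is None:
        return None
    # Captured groups are strings, so convert them explicitly.
    return int(match.group('COL3')), int(match.group('COL4'))

rows = ['QANH010008.1:18255-18820(-):Hab_ob', 'QANH010002:7-10(-):Hab_ob']
ranges = [extract_range(row) for row in rows]
print(ranges)  # [(18255, 18820), (7, 10)]
```

The digits in the identifier (for example `010008.1`) are skipped because they are not directly followed by a hyphen. Note that the regex captures text, so the pandas columns produced by `str.extract` hold strings as well; if numeric columns are needed, they would have to be converted afterwards, for example with `astype(int)`.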

Upvotes: 2
