RhinoH

Reputation: 25

Extracting a pattern from a string in Python (pandas) gives an unexpected result

import pandas as pd

x = pd.Series(['CA1234567', 'QWCEC'])
x.str.extract(r'(CA|US)\d{7}$')

The expected result is ['CA1234567', NaN], but I get ['CA', NaN].

Upvotes: 0

Views: 28

Answers (2)

bigbounty

Reputation: 17368

Wrap the whole pattern in an outer capture group, then select the first group (column 0) of the result:

In [105]: x = pd.Series(['CA1234567', 'QWCEC'])
     ...: x.str.extract(r'((CA|US)\d{7})$')[0].tolist()
Out[105]: ['CA1234567', nan]
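
A variant of the same idea, as a minimal sketch: making the inner alternation non-capturing with `(?:...)` leaves a single capture group, and `expand=False` then returns a Series rather than a one-column DataFrame. The variable name `codes` is only illustrative.

```python
import pandas as pd

x = pd.Series(['CA1234567', 'QWCEC'])

# (?:CA|US) groups the alternation without capturing it, so the outer
# parentheses are the only capture group and hold the full code.
codes = x.str.extract(r'((?:CA|US)\d{7})$', expand=False)

print(codes.tolist())  # ['CA1234567', nan]
```

With only one capture group there is no extra column holding just the `CA`/`US` prefix, so no `[0]` indexing is needed.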

Upvotes: 1

rioV8

Reputation: 28673

str.extract returns only what the capture groups match, so include the digits in the capture group:

import pandas as pd

x = pd.Series(['CA1234567', 'QWCEC'])
x.str.extract(r'((CA|US)\d{7})$')
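
Why the original pattern returned only `CA` can be illustrated with the standard `re` module, as a rough sketch: the whole match and the first capture group are different things, and `str.extract` reports the capture groups. The variable names below are illustrative.

```python
import re

# Original pattern: only the country prefix is inside parentheses.
prefix_only = re.compile(r'(CA|US)\d{7}$')
# Revised pattern: the outer parentheses capture prefix and digits together.
whole_code = re.compile(r'((CA|US)\d{7})$')

m = prefix_only.search('CA1234567')
full_match = m.group(0)    # entire matched text: 'CA1234567'
first_group = m.group(1)   # first capture group only: 'CA'

m2 = whole_code.search('CA1234567')
outer_group = m2.group(1)  # outer group: 'CA1234567'
inner_group = m2.group(2)  # inner group: 'CA'

# A string that does not fit the pattern yields no match at all.
no_match = whole_code.search('QWCEC')  # None

print(full_match, first_group, outer_group, inner_group, no_match)
```

The revised pattern has two capture groups, so in pandas it produces two columns: column 0 with the full code and column 1 with the prefix.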

Upvotes: 1
