Reputation: 25
import pandas as pd

x = pd.Series(['CA1234567', 'QWCEC'])
x.str.extract(r'(CA|US)\d{7}$')
The expected result is ['CA1234567', NaN], but I get ['CA', NaN].
Upvotes: 0
Views: 28
Reputation: 17368
Wrap the whole pattern in an outer capture group, then select the first group (column 0) of the result:
In [105]: x = pd.Series(['CA1234567', 'QWCEC'])
...: x.str.extract(r'((CA|US)\d{7})$')[0].tolist()
Out[105]: ['CA1234567', nan]
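To see why the [0] is needed, here is a minimal sketch of what the call returns before the column is selected. str.extract produces one column per capture group, so the nested pattern yields two columns. The names result and full_code are only illustrative.

```python
import pandas as pd

x = pd.Series(['CA1234567', 'QWCEC'])

# Two capture groups -> a DataFrame with two columns:
# column 0 is the outer group (prefix plus seven digits),
# column 1 is the inner group (just the prefix).
result = x.str.extract(r'((CA|US)\d{7})$')

# Selecting column 0 keeps only the full code.
full_code = result[0]
print(full_code.tolist())
```

Rows that do not match the pattern come back as NaN in every column.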
Upvotes: 1
Reputation: 28673
Include the digits in the capture group. str.extract returns only what the capture groups match, so the group has to span the whole code:
x = pd.Series(['CA1234567', 'QWCEC'])
x.str.extract(r'((CA|US)\d{7})$')
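If only the full code is wanted, one possible variation (not part of the original answer) is to make the inner alternation non-capturing with (?:...), so that a single capture group remains. Passing expand=False then returns a Series instead of a one-column DataFrame. The name codes is illustrative.

```python
import pandas as pd

x = pd.Series(['CA1234567', 'QWCEC'])

# (?:CA|US) groups the alternation without capturing it,
# so the outer parentheses are the only capture group.
codes = x.str.extract(r'((?:CA|US)\d{7})$', expand=False)
print(codes.tolist())
```

This avoids the extra prefix column that the nested capturing groups produce.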
Upvotes: 1