Reputation: 6420
Is it possible in Python to have the type of a capture group be an integer?
Let's assume I have the following regex:
>>> import re
>>> p = re.compile('[0-9]+')
>>> re.search(p, 'abc123def').group(0)
'123'
I wish the type of '123' in the group were int, since the pattern can only match integers. It feels like there has to be a better way than defining a pattern that only matches numbers and then still having to convert the result to an int afterwards.
The background is that I have a complex regex with multiple named capture groups, and some of those capture groups only match integers. I would like those capture groups to be of type integer.
Upvotes: 3
Views: 3118
Reputation: 387
An example use case: taking the average of two street numbers.
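import pandas as pd
addresses = pd.Series(["3 - 5 Mint Road", "20-23 Cinnamon Street"])
def street_number_average(capture):
    # Capture groups are strings, so convert them before doing arithmetic
    number_1 = int(capture.group(1))
    number_2 = int(capture.group(2))
    average = round((number_1 + number_2) / 2)
    # The replacement function must return a string
    return str(average)
pattern = r'(\d\d?) *?- *?(\d\d?)'
# pandas 2.0+ defaults to regex=False, so pass regex=True for a callable replacement
addresses.str.replace(pattern, street_number_average, regex=True)
# > 0 4 Mint Road
# > 1 22 Cinnamon Street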
Don't forget to convert back to a string after doing the operations on the numbers, or it will return a NaN.
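The same idea can be sketched without pandas, using the standard library's re.sub with a function as the replacement. This is only an illustration under the assumption that plain Python strings are enough for the use case; the list of addresses and the results variable are made up for the example.

```python
import re

def street_number_average(match):
    # Groups are always str, so convert before doing arithmetic.
    number_1 = int(match.group(1))
    number_2 = int(match.group(2))
    average = round((number_1 + number_2) / 2)
    # re.sub needs the replacement function to return a str.
    return str(average)

pattern = re.compile(r'(\d\d?) *?- *?(\d\d?)')
addresses = ["3 - 5 Mint Road", "20-23 Cinnamon Street"]

# Apply the substitution to each address in turn.
results = [pattern.sub(street_number_average, address) for address in addresses]
print(results)
```

Here too the conversion to int and back to str happens inside the callback; the regex engine itself only ever sees and returns text.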
Upvotes: 1
Reputation: 12782
People might be misunderstanding the question due to wording.
They are correct that regular expressions only operate on text: subclasses of basestring (the str and unicode classes) in Python 2, or str and bytes in Python 3.
However, within regular expressions there are symbols that match classes of characters. \d matches digits and should do that for you.
See the pythex website or read up on Regular Expressions on other sites for more info.
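A small sketch of how \d behaves in Python 3, in case it helps: on str patterns it matches any Unicode decimal digit unless re.ASCII is given, whereas [0-9] only matches ASCII digits. Either way the matched group is still a str. The sample string below is chosen just for illustration.

```python
import re

# The digits 1, 2, 3 written in Arabic-Indic script (illustrative input).
arabic_indic = '\u0661\u0662\u0663'

# \d on a str pattern matches Unicode decimal digits by default.
unicode_match = re.fullmatch(r'\d+', arabic_indic)

# With re.ASCII, \d is restricted to 0-9, so this does not match.
ascii_match = re.fullmatch(r'\d+', arabic_indic, flags=re.ASCII)

# An explicit [0-9] range never matches non-ASCII digits.
range_match = re.fullmatch(r'[0-9]+', arabic_indic)

# The group is text; int() still has to be called explicitly.
value = int(unicode_match.group(0))
print(type(unicode_match.group(0)), value)
```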
Upvotes: -1
Reputation: 15953
Unfortunately that's the best you can do.
>>> import re
>>> p = re.compile('[0-9]+')
>>> a = re.search(p, 'abc123def').group(0)
>>> a.isdigit()
True
>>> a
'123'
>>> type(a)
<class 'str'>
Create an if statement from isdigit() and go from there.
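As a rough sketch of what that if statement could look like, wrapped in a helper (first_int is a hypothetical name, not part of re):

```python
import re

def first_int(text):
    """Return the first run of ASCII digits in text as an int, or None."""
    match = re.search(r'[0-9]+', text)
    if match is None:
        # No digits anywhere in the input.
        return None
    value = match.group(0)
    # With [0-9]+ the group is always ASCII digits, so this check passes
    # and int() is safe; the check matters more for looser patterns.
    if value.isdigit():
        return int(value)
    return None

print(first_int('abc123def'))
print(first_int('abcdef'))
```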
Upvotes: 2
Reputation: 155438
No, there is not. You can convert it yourself, but re operates on text and produces text; that's it.
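For the case with several named groups, one way to do the conversion yourself is to keep a mapping from group name to converter and apply it to groupdict(). This is only a sketch: the pattern, the group names, and the typed_groupdict helper are all hypothetical and not something re provides.

```python
import re

# Hypothetical pattern and group names, for illustration only.
PATTERN = re.compile(r'(?P<name>[a-z]+)(?P<count>[0-9]+)(?P<suffix>[a-z]+)')

# Which named groups should be converted, and with what.
CONVERTERS = {'count': int}

def typed_groupdict(match, converters):
    """Return match.groupdict() with the selected groups converted."""
    result = {}
    for name, value in match.groupdict().items():
        convert = converters.get(name)
        # Groups that did not participate in the match are None; skip those.
        if convert is not None and value is not None:
            value = convert(value)
        result[name] = value
    return result

match = PATTERN.search('abc123def')
fields = typed_groupdict(match, CONVERTERS)
print(fields)
```

The regex still only produces strings; the mapping just keeps the int() calls in one place instead of scattering them through the code.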
Upvotes: 8