Tim Zimmermann

Reputation: 6420

Define or determine type integer in python regex

Is it possible in Python to have the type of a capture group be an integer?
Let's assume I have the following regex:

>>> import re
>>> p = re.compile('[0-9]+')
>>> re.search(p, 'abc123def').group(0)
'123'

I would like the '123' returned by the group to be an int, since the pattern can only match integers. It feels like there should be a better way than writing a pattern that only matches digits and then still having to convert the result to an int afterwards.
The background is that I have a complex regex with multiple named capture groups, and some of those capture groups only match integers. I would like those capture groups to be of type integer.

Upvotes: 3

Views: 3118

Answers (4)

rturquier

Reputation: 387

An example use case: taking the average of two street numbers.

import pandas as pd

addresses = pd.Series(["3 - 5 Mint Road", "20-23 Cinnamon Street"])

def street_number_average(capture):
    number_1 = int(capture.group(1))
    number_2 = int(capture.group(2))
    average  = round((number_1 + number_2) / 2)
    return str(average)

pattern = r'(\d\d?) *?- *?(\d\d?)'

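# regex=True is needed for a callable replacement (the default is regex=False since pandas 2.0)
addresses.str.replace(pattern, street_number_average, regex=True)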

# > 0           4 Mint Road
# > 1    22 Cinnamon Street

Don't forget to convert back to string after doing the operations on the numbers, or it will return a NaN.

Upvotes: 1

uchuugaka

Reputation: 12782

People might be misunderstanding the question due to wording.

They are correct that regular expressions only operate on text: subclasses of basestring (str and unicode) in Python 2, or str and bytes in Python 3.

However, within regular expressions there are symbols that match classes of characters; \d matches digits and should do that for you.

See the pythex website or read up on Regular Expressions on other sites for more info.

Upvotes: -1

Leb

Reputation: 15953

Unfortunately that's the best you can do.

>>> import re
>>> p = re.compile('[0-9]+')
>>> a = re.search(p, 'abc123def').group(0)
>>> a.isdigit()
True
>>> a
'123'
>>> type(a)
<class 'str'>

Build an if statement around isdigit() and convert with int() from there.
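A minimal sketch of that check-then-convert step, assuming the same '[0-9]+' pattern as in the question. The helper name first_int and its default parameter are made up for illustration.

```python
import re

def first_int(text, default=None):
    """Return the first run of ASCII digits in `text` as an int, else `default`."""
    match = re.search(r'[0-9]+', text)
    if match is None:
        return default
    digits = match.group(0)
    # With [0-9]+ this check always passes; it is kept to mirror the
    # isdigit() test suggested above.
    if digits.isdigit():
        return int(digits)
    return default

print(first_int('abc123def'))  # 123
print(first_int('no digits'))  # None
```

Because the pattern already restricts the match to ASCII digits, calling int() directly on the group would behave the same way here.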

Upvotes: 2

ShadowRanger

Reputation: 155438

No, there is not. You can convert it yourself, but re operates on text, and produces text, that's it.
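For the named-group case in the question, one way to do the conversion yourself is to post-process match.groupdict(). This is only a sketch: the pattern, the group names (item, qty, price), the CONVERTERS mapping and the typed_groupdict helper are all hypothetical, not part of re.

```python
import re

# Hypothetical pattern: "qty" and "price" only match integers, "item" is text.
PATTERN = re.compile(r'(?P<item>[a-z]+) x(?P<qty>[0-9]+) @(?P<price>[0-9]+)')

# Hypothetical mapping from group name to the converter to apply.
CONVERTERS = {'qty': int, 'price': int}

def typed_groupdict(match, converters):
    """Return match.groupdict() with the listed groups passed through a converter."""
    result = {}
    for name, value in match.groupdict().items():
        convert = converters.get(name)
        # Groups that did not take part in the match are None; leave them as-is.
        if convert is not None and value is not None:
            result[name] = convert(value)
        else:
            result[name] = value
    return result

match = PATTERN.search('order: apple x3 @25')
print(typed_groupdict(match, CONVERTERS))
# {'item': 'apple', 'qty': 3, 'price': 25}
```

The regex still produces strings; the conversion simply lives in one place instead of being repeated at every call site.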

Upvotes: 8
