Ben

Reputation: 21645

How to extract numbers from a string, ignoring number-letter mixtures

For example, the following regex removes every standalone number from a string, leaving only the "non-numbers":

>>> import re
>>> re.sub(r"\b[0-9]+\b", "", "5 1 inch c5 bolts 10")
'  inch c5 bolts '

How do I do the opposite? That is, how do I extract the numbers '5 1 10'? (Note: c5 is not a number, so it should not be included in the result.)

Upvotes: 0

Views: 206

Answers (2)

PM 2Ring

Reputation: 55489

Since you're only looking for non-negative integers, you can do this without regex by splitting the string on whitespace and keeping the tokens that pass the str.isdigit method.

s = "5 1 inch c5 bolts 10"
a = [u for u in s.split() if u.isdigit()]
print(a)
b = ' '.join(a)
print(repr(b))

output

['5', '1', '10']
'5 1 10'

If you actually want a list of the numbers as integers, you can modify the list comprehension like this:

a = [int(u) for u in s.split() if u.isdigit()]
print(a)

output

[5, 1, 10]
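
One caveat with this approach: str.isdigit also returns True for some non-ASCII characters, such as the superscript '²', which int() then rejects with a ValueError. A minimal sketch that guards against this, assuming only ASCII digits are wanted. The helper name extract_ints is hypothetical, and str.isascii needs Python 3.7 or later:

```python
def extract_ints(s):
    """Return the whitespace-separated tokens of s that consist only of ASCII digits, as ints."""
    # u.isascii() filters out tokens such as '²', for which isdigit() is True
    # but int() raises ValueError.
    return [int(u) for u in s.split() if u.isascii() and u.isdigit()]

print(extract_ints("5 1 inch c5 bolts 10"))  # [5, 1, 10]
print(extract_ints("5 ² 10"))                # [5, 10]
```

For plain ASCII input like the string in the question, this behaves the same as the list comprehension above.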

Upvotes: 1

mgilson

Reputation: 310049

It looks like you already know about word boundaries... You're just looking for a word boundary, a run of digits (and only digits), and then another word boundary. The regex for that is \b\d+\b, and re.findall returns every match as a list:

>>> import re
>>> re.findall(r'\b\d+\b', "5 1 inch c5 bolts 10")
['5', '1', '10']
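
Note that \b also sits between a digit and punctuation, so \b\d+\b picks up both halves of something like "3.5". If only whitespace-delimited numbers should count, one possible variant (an assumption about the desired behavior, not part of the answer above) replaces the word boundaries with lookarounds that require whitespace or the string edge on either side. The names WORD_BOUNDARY and WHITESPACE_DELIMITED are hypothetical:

```python
import re

# Pattern from the answer above: digits delimited by word boundaries.
WORD_BOUNDARY = re.compile(r'\b\d+\b')
# Variant: digits not preceded or followed by any non-whitespace character.
WHITESPACE_DELIMITED = re.compile(r'(?<!\S)\d+(?!\S)')

s = "3.5 mm c5 10"
print(WORD_BOUNDARY.findall(s))         # ['3', '5', '10']
print(WHITESPACE_DELIMITED.findall(s))  # ['10']

# On the string from the question, both patterns agree.
q = "5 1 inch c5 bolts 10"
print(WHITESPACE_DELIMITED.findall(q))  # ['5', '1', '10']
```

Which behavior is right depends on the data: the lookaround version also skips a number followed directly by a comma or period, which may or may not be what you want.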

Upvotes: 5
