Leonardo Farias

Reputation: 65

Python regex Get first element after specific string

I'm trying to get the first number (int or float) after a specific pattern:

strings = ["Building 38 House 10",
           "Building : 10.5 house 900"]
for x in strings:
    print(<rule>)

Wanted result:

'38'
'10.5'

I tried:

import re

for x in strings:
    print(re.findall(r"(?<=Building).+\d+", x))
    print(re.findall(r"(?<=Building).+(\d+.?\d+)", x))

Output:

[' 38 House 10']
['10']
[' : 10.5 house 900']
['00']

But I'm missing something.

Upvotes: 2

Views: 505

Answers (3)

bobble bubble

Reputation: 18490

One idea is to use \D (the negation of \d) to match any non-digits in between, then capture the number:

Building\D*\b([\d.]+)

See this demo at regex101 or Python demo at tio.run

Also, put word boundaries \b around Building if it should only match as a whole word.
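
As a rough sketch of how this pattern could be used in Python (the helper name first_number_after_building is made up for illustration, and the \b around Building is the optional whole-word variant mentioned above):

```python
import re

strings = ["Building 38 House 10",
           "Building : 10.5 house 900"]

# \D* skips any non-digit characters that follow "Building";
# ([\d.]+) then captures the first run of digits and dots.
pattern = re.compile(r"\bBuilding\b\D*\b([\d.]+)")

def first_number_after_building(text):
    """Hypothetical helper: return the captured number as a string, or None."""
    m = pattern.search(text)
    return m.group(1) if m else None

results = [first_number_after_building(s) for s in strings]
print(results)
```

Note that [\d.]+ accepts any mix of digits and dots, so an input such as "Building 1.2.3" would be captured whole; a stricter number pattern is needed if that matters.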

Upvotes: 1

The fourth bird

Reputation: 163352

You could use a capture group:

\bBuilding[\s:]+(\d+(?:\.\d+)?)\b

Explanation

  • \bBuilding Match the word Building
  • [\s:]+ Match 1+ whitespace chars or colons
  • (\d+(?:\.\d+)?) Capture group 1, match 1+ digits with an optional decimal part
  • \b A word boundary


import re
strings = ["Building 38 House 10",
           "Building : 10.5 house 900"]
pattern = r"\bBuilding[\s:]+(\d+(?:\.\d+)?)"
for x in strings:
    m = re.search(pattern, x)
    if m:
        print(m.group(1))

Output

38
10.5

Upvotes: 2

Hidi Eric

Reputation: 364

re.findall(r"(?<![a-zA-Z:])[-+]?\d*\.?\d+", x)

This finds every number in the given string that is not immediately preceded by a letter or a colon.

If you want only the first number, index the result:

re.findall(r"(?<![a-zA-Z:])[-+]?\d*\.?\d+", x)[0]
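
A minimal sketch of the same pattern with a guard for strings that contain no number, since indexing an empty findall result with [0] raises IndexError. The names NUMBER and first_number are illustrative only. Keep in mind that this pattern is not tied to the word Building, so it returns the first number anywhere in the string:

```python
import re

strings = ["Building 38 House 10",
           "Building : 10.5 house 900"]

# Optionally signed integer or decimal that is not directly
# preceded by a letter or a colon.
NUMBER = re.compile(r"(?<![a-zA-Z:])[-+]?\d*\.?\d+")

def first_number(text):
    """Hypothetical helper: first match as a string, or None if there is none."""
    m = NUMBER.search(text)
    return m.group(0) if m else None

all_numbers = [NUMBER.findall(s) for s in strings]
firsts = [first_number(s) for s in strings]
print(all_numbers)
print(firsts)
```

Using re.search instead of findall(...)[0] also stops scanning at the first match rather than collecting every number first.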

Upvotes: 0
