dizzy

Reputation: 49

How to extract a float from a string after a keyword in Python

I have the following string, from which I need to extract the value 14.123456 that comes directly after the keyword airline_freq: (the keyword is unique in my string).

Please help me find the correct regex (calling m.group() with an index above 0 doesn't work).

import re
s =  "DATA:init:     221.000OTHER:airline_freq:  14.123456FEATURE:airline_amp:   0.333887 more text"
m = re.search(r'[airline_freq:\s]?\d*\.\d+|\d+', s)
m.group()

$ result 221.000

Upvotes: 0

Views: 545

Answers (3)

Majoris

Reputation: 3189

This captures only the float, as group 1 of the match.

r'airline_freq:\s+([-0-9.]+)'
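
As a minimal sketch of how this pattern might be used on the string from the question: the parentheses form group 1, so m.group(1) should return just the number, without the keyword or the spaces. The variable name value is only illustrative. Note that [-0-9.]+ is permissive and would also accept malformed input such as 1.2.3, which float() would then reject.

```python
import re

s = "DATA:init:     221.000OTHER:airline_freq:  14.123456FEATURE:airline_amp:   0.333887 more text"

# The keyword and whitespace are matched but not captured;
# only the characters inside the parentheses end up in group 1.
m = re.search(r'airline_freq:\s+([-0-9.]+)', s)

# re.search returns None when nothing matches, so guard before converting.
value = float(m.group(1)) if m else None
print(value)
```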

"DATA:init:     221.000OTHER:airline_freq:  14.123456FEATURE:airline_amp:   0.333887 more text"

Upvotes: 1

Enlico

Reputation: 28406

You can probably use this:

(?<=airline_freq:)\s*(?:-?(?:\d+(?:\.\d*)?|\.\d+))

This uses a lookbehind to enforce that the number is preceded by airline_freq: but it does not make it part of the match.

The number-matching part of the regex matches numbers with or without a decimal point; when the point is present, it may also be leading or trailing, as in .5 or 5. (but it cannot come before the - sign). You can also allow an optional + in place of the -, by using [+-] instead of -.

Unfortunately, Python's re module does not seem to allow variable-length lookbehind, so I cannot put the \s* inside it; as a consequence, the spaces between the : and the number are part of the match. In general this should not be a problem, since leading spaces are usually skipped automatically when a number is passed to a program.

However, you can still remove the first ?: in the regex above to make the number-matching group capturing, so that the number is available as \1.

The example is here.
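
A rough sketch of the capturing variant in Python, assuming the first ?: has been removed as described; the names pattern, whole and number are illustrative only. It shows that the full match keeps the leading spaces while group 1 holds the bare number.

```python
import re

s = "DATA:init:     221.000OTHER:airline_freq:  14.123456FEATURE:airline_amp:   0.333887 more text"

# The lookbehind requires the keyword but keeps it out of the match.
# The outer number group is capturing (the first ?: has been removed).
pattern = re.compile(r'(?<=airline_freq:)\s*(-?(?:\d+(?:\.\d*)?|\.\d+))')

m = pattern.search(s)
whole = m.group(0)    # full match, including the spaces after the colon
number = m.group(1)   # only the number

print(repr(whole), repr(number))
# float() ignores leading whitespace, so either string can be converted.
print(float(whole) == float(number))
```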

Upvotes: 1

kinoute

Reputation: 167

I have this:

(?<=airline_freq\:\s\s)(\d+\.\d+)

In [2]: import re
   ...: s =  "DATA:init:     221.000OTHER:airline_freq:  14.123456FEATURE:airline_amp:   0.333887 more text"
   ...: m = re.search(r'(?<=airline_freq\:\s\s)(\d+\.\d+)', s)
   ...: m.group()
Out[2]: '14.123456'

Test: https://regexr.com/51q41

If you're not sure about the number of spaces between airline_freq: and the desired float number, you can use:

(?<=airline_freq\:)\s*(\d+\.\d+)

and then call m.group().lstrip() to strip the leading spaces.
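
For illustration, a short sketch of this whitespace-tolerant variant. Because the pattern already wraps the digits in parentheses, m.group(1) appears to be an alternative to lstrip(); the names stripped and captured are hypothetical.

```python
import re

s = "DATA:init:     221.000OTHER:airline_freq:  14.123456FEATURE:airline_amp:   0.333887 more text"

# \s* sits outside the lookbehind, so the spaces are part of the full match.
m = re.search(r'(?<=airline_freq\:)\s*(\d+\.\d+)', s)

stripped = m.group().lstrip()  # full match with the leading spaces removed
captured = m.group(1)          # group 1 never included the spaces

print(stripped, captured)
```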

Upvotes: 0
