Reputation: 888
I have some strings which contain pairs of frequencies or pairs of frequency ranges.
My regex function gets the following list from that example string:
example_string = ':2.400-2.483ghz;5.725-5.850ghz\n\ntransmissionpower(eirp),2'
re.findall(r"(\d+\.\d+.hz)", example_string)
# example output: ['2.483ghz', '5.850ghz']
How can I extract the full range of frequencies rather than just the single float after the - character?
Output should be ['2.400-2.483ghz', '5.725-5.850ghz']
Upvotes: 0
Views: 62
Reputation: 19603
Something like this should (mostly) work to find all the occurrences of those strings in the code (it should handle any number of ranges in the line):
>>> example_string = ':2.400-2.483ghz;5.725-5.850ghz\n\ntransmissionpower(eirp),2'
>>> re.findall('([0-9.]+-[0-9.]+.?hz)', example_string)
['2.400-2.483ghz', '5.725-5.850ghz']
To break it down:
[0-9.]+ : finds 1 or more digits and .s together (e.g. 2.400)
.?hz : finds 0 or 1 characters followed by 'hz', so it should handle most units (e.g. hz, ghz, etc.)
The whole thing essentially looks for <number><dash><number><units> zero or more times per line.
It's worth pointing out that, like most regexes, this is still pretty brittle: if the string is malformed, if the unit is written GHz instead of ghz, if the numbers are in scientific notation, etc., it will break. Hopefully you can adjust it as needed.
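As one possible adjustment for the case-sensitivity issue mentioned above, here is a minimal sketch that compiles the same pattern with re.IGNORECASE. The mixed-case input string is a hypothetical variation of the question's example, not something from the original post, and the outer capture group is dropped because findall returns the whole match when the pattern has no groups.

```python
import re

# Hypothetical input: same shape as the question's string, but with a mixed-case unit.
text = ':2.400-2.483GHz;5.725-5.850ghz\n\ntransmissionpower(eirp),2'

# Same pattern as in this answer, compiled case-insensitively so that
# 'GHz', 'ghz' and similar spellings of the unit are all accepted.
pattern = re.compile(r'[0-9.]+-[0-9.]+.?hz', re.IGNORECASE)

ranges = pattern.findall(text)
print(ranges)  # ['2.400-2.483GHz', '5.725-5.850ghz']
```

This only addresses the capitalisation of the unit; scientific notation or other separators would still need changes to the number part of the pattern.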
Upvotes: 3
Reputation: 785128
You may use this regex:
(?:\d+\.\d+-)?\d+\.\d+.hz
Code:
>>> import re
>>> s = ':2.400-2.483ghz;5.725-5.850ghz\n\ntransmissionpower(eirp),2'
>>> re.findall(r'(?:\d+\.\d+-)?\d+\.\d+.hz', s)
['2.400-2.483ghz', '5.725-5.850ghz']
Explanation:
(?:\d+\.\d+-)? : In an optional non-capturing group, match a floating point number followed by a hyphen
\d+\.\d+ : Match a floating point number
.hz : Match any character followed by hz
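Because the leading group is optional, the pattern explained above should also pick up a lone frequency that has no range. A small sketch of that behaviour follows; the input string mixing a range with a single value is a hypothetical example, not taken from the question.

```python
import re

# Pattern from this answer: optional "<float>-" prefix, then "<float><char>hz".
pattern = re.compile(r'(?:\d+\.\d+-)?\d+\.\d+.hz')

# Hypothetical input containing one range and one single frequency.
mixed = ':2.400-2.483ghz;5.800ghz'

matches = pattern.findall(mixed)
print(matches)  # ['2.400-2.483ghz', '5.800ghz']
```

Since the group is non-capturing, findall returns each whole match rather than just the contents of the group.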
Upvotes: 2