Reputation: 888
I have a list of strings and I want to use a regex to match a single digit (with its unit) only when no other digits come before it.
import re

strings = ['5.8 GHz', '5 GHz']
for s in strings:
    print(re.findall(r'\d\s[GM]?Hz', s))
# output
['8 GHz']
['5 GHz']
# desired output
['5 GHz']
I want it to return just '5 GHz'; the first string shouldn't produce any matches. How can I modify my pattern to get the desired output?
Upvotes: 1
Views: 536
Reputation: 324
Updated Answer
import re

a = ['5.8 GHz', '5 GHz', '8 GHz', '1.2', '1.2 Some Random String', '1 Some String', '1 MHz of frequency', '2 Some String in Between MHz']
res = []
for fr in a:
    # Keep only strings that are exactly one digit, one whitespace character, then GHz or MHz.
    if re.match(r'^[0-9](?=.[^0-9])(\s)[GM]Hz$', fr):
        res.append(fr)
print(res)
Output:
['5 GHz', '8 GHz']
Upvotes: 1
Reputation: 3629
My two cents:
selected_strings = list(filter(
    lambda x: re.findall(r'(?:^|\s+)\d+\s+(?:G|M)Hz', x),
    strings
))
With strings = ['2 GHz', '5.8 GHz', ' 5 GHz', '3.4 MHz', '3 MHz', '1 MHz of Frequency'], selected_strings is:
['2 GHz', ' 5 GHz', '3 MHz', '1 MHz of Frequency']
Upvotes: 0
Reputation: 75840
As per my comment, it seems that you can use:
(?<!\d\.)\d+\s[GM]?Hz\b
This matches:
(?<!\d\.) - A negative lookbehind asserting that the position is not right after a single digit and a literal dot.
\d+ - One or more digits, matching the integer part of the frequency.
\s - A single whitespace character.
[GM]?Hz - An optional uppercase G or M followed by "Hz".
\b - A word boundary.
Upvotes: 2
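A minimal sketch of how the lookbehind pattern (?<!\d\.)\d+\s[GM]?Hz\b from the answer above could be applied to the question's list. The helper name integer_frequencies is hypothetical and only there for illustration; the pattern itself is copied from the answer unchanged.

```python
import re

# Pattern quoted from the answer; compiled once so it can be reused.
PATTERN = re.compile(r'(?<!\d\.)\d+\s[GM]?Hz\b')


def integer_frequencies(text):
    """Return every substring of text matched by PATTERN (hypothetical helper)."""
    return PATTERN.findall(text)


strings = ['5.8 GHz', '5 GHz']
for s in strings:
    print(integer_frequencies(s))
# prints [] for '5.8 GHz' and ['5 GHz'] for '5 GHz'
```

One caveat: the lookbehind only inspects the two characters immediately before the match, so an input with more than one decimal digit, such as '5.85 GHz', would still yield '5 GHz'. If such inputs can occur, anchoring the match to the start of the string may be a safer choice.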
Reputation: 687
>>> import re
>>> strings = ['5.8 GHz', '5 GHz']
>>>
>>> for s in strings:
...     match = re.match(r'^[^0-9]*([0-9] [GM]Hz)', s)
...     if match:
...         print(match.group(1))
...
5 GHz
Upvotes: 2