Reputation: 53
I have the following strings:
String 1-
Cisco IOS Software, C3900 Software (C3900-UNIVERSALK9-M), Version 15.4(3)M3, RELEASE SOFTWARE (fc2)
ROM: System Bootstrap, Version 15.0(1r)M16, RELEASE SOFTWARE (fc1)
String2-
Cisco IOS XE Software, Version 16.05.01b
Cisco IOS Software [Everest], ISR Software (X86_64_LINUX_IOSD-UNIVERSALK9-M), Version 16.5.1b, RELEASE SOFTWARE (fc1)
licensed under the GNU General Public License ("GPL") Version 2.0. The
software code licensed under GPL Version 2.0 is free software that comes
GPL code under the terms of GPL Version 2.0. For more details, see the
From both strings I need to extract only 16.05.01b and 15.4(3)M3 when I run the regex.
I have tried r'((?<=Version\s)\d+\.\d+\(\d+...)'. With it I am able to fetch 15.4(3)M3 but not 16.05.01b.
A single regular expression should be able to fetch the version from both strings, but my attempts do not give me that result.
Upvotes: 4
Views: 615
Reputation: 436
Well, that's because your regex expects an opening parenthesis inside the version, and the second string does not have one.
This is an easy way to solve it (I borrowed the strings from abdusco):
import re

strings = [
    '-M), Version 15.4(3)M3, RELEA',
    'rap, Version 15.0(1r)M16, RELEA',
    ', Version 16.5.1b, RELEASE']
versions = []
# Match the start of the version plus the next 8 characters
version = re.compile(r'(?<=Version\s)\d+\.\d........')
for s in strings:
    # Keep only the text before the first comma
    v = version.search(s).group(0).split(',')[0]
    versions.append(v)
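To see why the pattern from the question misses the second string, a minimal sketch like the following may help. It runs the question's pattern against abridged copies of both strings; the names `asker_re`, `with_paren`, and `without_paren` are made up for this illustration.

```python
import re

# Pattern from the question: it requires a literal "(" after "digits.digits".
asker_re = re.compile(r'(?<=Version\s)\d+\.\d+\(\d+...')

with_paren = 'Software (C3900-UNIVERSALK9-M), Version 15.4(3)M3, RELEASE SOFTWARE (fc2)'
without_paren = 'Cisco IOS XE Software, Version 16.05.01b'

m1 = asker_re.search(with_paren)
m2 = asker_re.search(without_paren)

print(m1.group(0) if m1 else None)  # 15.4(3)M3
print(m2.group(0) if m2 else None)  # None, because 16.05.01b has no "("
```

Since `search` returns `None` when nothing matches, calling `.group(0)` directly on the result would raise an `AttributeError` for the second string, so the sketch checks the match first.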
Upvotes: 1
Reputation: 11081
In your examples a version is prefixed with Version and includes digits, letters, dots, and parentheses.
Here, I model a version as something that starts with a digit and continues with any combination of the items above.
This should work:
import re

strings = [
    '-M), Version 15.4(3)M3, RELEA',
    'rap, Version 15.0(1r)M16, RELEA',
    ', Version 16.5.1b, RELEASE',
    're, Version 16.05.01b'
]
version_re = re.compile(r'version (\d[\w.()]+)', flags=re.IGNORECASE)
for s in strings:
    v = version_re.search(s).group(1)
    print(v)
output:
15.4(3)M3
15.0(1r)M16
16.5.1b
16.05.01b
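If only the first version in each block of text is wanted, as the question seems to ask, `search` already stops at the first hit. The sketch below assumes that reading of the question; `first_version` is a hypothetical helper name, and the two sample texts are abridged from the question.

```python
import re

version_re = re.compile(r'version (\d[\w.()]+)', flags=re.IGNORECASE)


def first_version(text):
    """Return the first version-like token after 'Version', or None."""
    match = version_re.search(text)
    return match.group(1) if match else None


string_1 = (
    "Cisco IOS Software, C3900 Software (C3900-UNIVERSALK9-M), "
    "Version 15.4(3)M3, RELEASE SOFTWARE (fc2)\n"
    "ROM: System Bootstrap, Version 15.0(1r)M16, RELEASE SOFTWARE (fc1)"
)
string_2 = (
    "Cisco IOS XE Software, Version 16.05.01b\n"
    "Cisco IOS Software [Everest], ISR Software "
    "(X86_64_LINUX_IOSD-UNIVERSALK9-M), Version 16.5.1b, RELEASE SOFTWARE (fc1)\n"
    'licensed under the GNU General Public License ("GPL") Version 2.0. The'
)

print(first_version(string_1))  # 15.4(3)M3
print(first_version(string_2))  # 16.05.01b

# findall lists every hit, including the one from the license text
all_hits = version_re.findall(string_2)
print(all_hits)  # ['16.05.01b', '16.5.1b', '2.0.']
```

Note that the character class also accepts a trailing dot, so `Version 2.0.` in the license text yields `2.0.`; taking only the first match sidesteps that for these inputs.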
Upvotes: 3
Reputation: 163277
You could use an alternation to get both values.
You can also omit the capturing group, since the whole match is the version. After the digits, dot, and digits, the pattern matches either an opening parenthesis, a single character, and a closing parenthesis followed by an uppercase letter A-Z and a digit, or a dot, 2 digits, and a lowercase letter a-z.
(?<=Version\s)\d+\.\d+(?:\([^()+]\)[A-Z]\d|\.\d{2}[a-z])
A more efficient version could use a capturing group instead of the lookbehind:
Version\s(\d+\.\d+(?:\([^()+]\)[A-Z]\d|\.\d{2}[a-z]))
import re
regex = r"(?<=Version\s)\d+\.\d+(?:\([^()+]\)[A-Z]\d|\.\d{2}[a-z])"
test_str = ("String 1-Cisco IOS Software, C3900 Software (C3900-UNIVERSALK9-M), Version 15.4(3)M3, RELEASE SOFTWARE (fc2)\n"
"ROM: System Bootstrap, Version 15.0(1r)M16, RELEASE SOFTWARE (fc1)\n\n"
"String2-Cisco IOS XE Software, Version 16.05.01b\n"
"Cisco IOS Software [Everest], ISR Software (X86_64_LINUX_IOSD-UNIVERSALK9-M), Version 16.5.1b, RELEASE SOFTWARE (fc1)\n"
"licensed under the GNU General Public License (\"GPL\") Version 2.0. The\n"
"software code licensed under GPL Version 2.0 is free software that comes\n"
"GPL code under the terms of GPL Version 2.0. For more details, see the")
print(re.findall(regex, test_str))
Result
['15.4(3)M3', '16.05.01b']
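The capturing-group variant mentioned above can be tried the same way. This is a minimal sketch on an abridged copy of the test string; with a single group in the pattern, `re.findall` returns the group contents rather than the whole match. The names `group_regex` and `sample` are chosen for this example.

```python
import re

# Same alternation, but "Version\s" is matched normally and the version is captured.
group_regex = r"Version\s(\d+\.\d+(?:\([^()+]\)[A-Z]\d|\.\d{2}[a-z]))"

sample = (
    "Cisco IOS Software, C3900 Software (C3900-UNIVERSALK9-M), Version 15.4(3)M3, RELEASE SOFTWARE (fc2)\n"
    "ROM: System Bootstrap, Version 15.0(1r)M16, RELEASE SOFTWARE (fc1)\n"
    "Cisco IOS XE Software, Version 16.05.01b\n"
    "GPL code under the terms of GPL Version 2.0. For more details, see the"
)

versions = re.findall(group_regex, sample)
print(versions)  # ['15.4(3)M3', '16.05.01b']
```

As written, the pattern skips 15.0(1r)M16 because `[^()+]` allows only one character between the parentheses, and it skips 2.0 because neither branch of the alternation follows it.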
Upvotes: 1