user3682875

Reputation: 49

Python regex multiline

I am trying to extract some information from multiline text, but with no luck: I am just getting None. What am I missing?

content = """
Cisco IOS Software, C880 Software (C880DATA-UNIVERSALK9-M), Version 15.4(2)T1, RELEASE SOFTWARE 
(fc3)

ROM: System Bootstrap, Version 12.4(22r)YB5, RELEASE SOFTWARE (fc1)



Cisco 999 (MPC8300) processor (revision 1.0) with 236544K/25600K bytes of memory.
Processor board ID FTX0000088X






Configuration register is 0x210
"""



print()
match = re.search(r".* Version (?P<OS_Version>\S+), .* Processor board ID (?P<Serial_Number>.* Configuration register is (?P<config_register>\S+)$",
                  content, flags=re.M)
print(match)

Upvotes: 0

Views: 85

Answers (1)

user5510840

There are several problems in your regex:

  • The flag re.DOTALL is missing to make . match newlines.
  • \S+ will match the comma after your version, which I don't think you want.
  • The capture group (?P<Serial_Number> is not closed.
  • There is a newline, not a space, before Processor board ID and Configuration register.
  • The $ after the config_register group is not needed. With re.M it matches at the end of any line, so it is harmless here, but it adds nothing.
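Two of these points are easy to check in isolation. Here is a minimal sketch on a shortened, made-up two-line string (`sample` and the variable names are only illustrative, not from the question):

```python
import re

# Hypothetical sample, shortened from the text in the question.
sample = "Software, Version 15.4(2)T1, RELEASE\nProcessor board ID FTX0000088X\n"

# Without re.DOTALL, "." stops at the newline, so the pattern cannot span lines.
no_dotall = re.search(r"Version.*board ID", sample)

# With re.DOTALL, "." also matches "\n".
with_dotall = re.search(r"Version.*board ID", sample, flags=re.DOTALL)

# A comma is not whitespace, so \S+ swallows it along with the version.
with_comma = re.search(r"Version (\S+)", sample).group(1)

# A character class limited to the expected characters stops before the comma.
without_comma = re.search(r"Version ([\w().]+)", sample).group(1)

print(no_dotall)                # None
print(with_dotall is not None)  # True
print(with_comma)               # 15.4(2)T1,
print(without_comma)            # 15.4(2)T1
```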

Depending on which version you want, your regex would be something like:

match = re.search(
    r".*?Version (?P<OS_Version>[\w().]+).*board ID (?P<Serial_Number>\w+).*register is (?P<config_register>\w+)",
    content,
    flags=re.M|re.DOTALL
)

or

match = re.search(
    r".*Version (?P<OS_Version>[\w().]+).*board ID (?P<Serial_Number>\w+).*register is (?P<config_register>\w+)",
    content,
    flags=re.M|re.DOTALL
)

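The only difference is the ? after the leading .* in the first pattern. It makes that .* non-greedy, so OS_Version is taken from the first occurrence of Version (15.4(2)T1). Without it, the greedy .* runs as far as it can, and OS_Version comes from the last occurrence of Version that still lets the rest of the pattern match (12.4(22r)YB5).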
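To see both variants side by side, here is a small runnable sketch. It uses a condensed copy of the text from the question (some blank lines removed), and the names `tail`, `lazy` and `greedy` are only for illustration. re.M is left out because the pattern contains no ^ or $, so that flag has no effect on it.

```python
import re

# Condensed copy of the text from the question.
content = """
Cisco IOS Software, C880 Software (C880DATA-UNIVERSALK9-M), Version 15.4(2)T1, RELEASE SOFTWARE (fc3)

ROM: System Bootstrap, Version 12.4(22r)YB5, RELEASE SOFTWARE (fc1)

Cisco 999 (MPC8300) processor (revision 1.0) with 236544K/25600K bytes of memory.
Processor board ID FTX0000088X

Configuration register is 0x210
"""

# Shared part of both patterns; only the leading ".*" / ".*?" differs.
tail = (
    r"Version (?P<OS_Version>[\w().]+)"
    r".*board ID (?P<Serial_Number>\w+)"
    r".*register is (?P<config_register>\w+)"
)

# Non-greedy prefix: stops at the first "Version".
lazy = re.search(r".*?" + tail, content, flags=re.DOTALL)

# Greedy prefix: backtracks from the end to the last "Version" that still matches.
greedy = re.search(r".*" + tail, content, flags=re.DOTALL)

print(lazy.group("OS_Version"))    # 15.4(2)T1
print(greedy.group("OS_Version"))  # 12.4(22r)YB5
print(lazy.groupdict())
```

The serial number and configuration register come out the same in both cases, since each appears only once in the text.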

Upvotes: 1
