pidgey

Reputation: 245

python get substring from regex

I want to extract a substring from a string that conforms to a certain regex. The regex is:

(\[\s*(\d)+ byte(s)?\s*\](\s*|\d|[A-F]|[a-f])+)

This means that all of these strings are accepted:

[4 bytes] 66 74 79 70 33 67 70 35
[ 4 bytes ] 66 74 79 70 33 67 70 35
[1 byte] 66 74 79 70 33 67 70 35

I want to extract only the number of bytes (just the number) from such a string. I thought of doing this with re.search, but I'm not sure whether that will work. What would be the cleanest and most performant way of doing this?

Upvotes: 1

Views: 245

Answers (1)

user1907906

Use match.group to get the groups your regular expression defines:

import re

s = """[4 bytes] 66 74 79 70 33 67 70 35
[ 4 bytes ] 66 74 79 70 33 67 70 35
[1 byte] 66 74 79 70 33 67 70 35"""
# (\d+) rather than (\d)+, so group 2 holds the whole number and not just its last digit
r = re.compile(r"(\[\s*(\d+) byte(s)?\s*\](\s*|\d|[A-F]|[a-f])+)")

for line in s.split("\n"):
    m = r.match(line)
    if m:
        print(m.group(2))

Group 1 spans the entire match, because the outer parentheses wrap the whole pattern; group 2 holds only the byte count, here 4.

Output:

4
4
1
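
If only the number is needed, a shorter pattern with a single named group may be easier to read. The following is a minimal sketch, not part of the original answer: the pattern, the group name count, and the helper byte_count are hypothetical, and the pattern is slightly more lenient than the one in the question (it accepts any whitespace before "byte"). It also illustrates why the quantifier belongs inside the group: with (\d)+, as written in the question, the group is overwritten on each repetition and keeps only the last digit.

```python
import re

# Hypothetical pattern: only the bracketed header is matched, and the
# number is captured as a whole by the named group "count".
BYTE_COUNT = re.compile(r"\[\s*(?P<count>\d+)\s+bytes?\s*\]")


def byte_count(line):
    """Return the declared byte count as an int, or None if no header is found."""
    m = BYTE_COUNT.search(line)
    return int(m.group("count")) if m else None


print(byte_count("[4 bytes] 66 74 79 70 33 67 70 35"))  # 4
print(byte_count("[ 12 bytes ] 66 74 79 70"))           # 12
print(byte_count("66 74 79 70"))                        # None

# (\d)+ repeats a one-digit group, so only the last digit survives:
print(re.match(r"\[\s*(\d)+ byte", "[12 bytes]").group(1))  # 2
```

re.search works here as well as re.match; the difference is that search scans the whole string, while match only tries at the start. Compiling the pattern once and reusing it avoids repeated cache lookups when many lines are processed.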

Upvotes: 6
