Sergey

Reputation: 21201

Python regexp for data of byte numbers

How can I write a regexp that matches a sequence of bytes?
For example, how can I use a regexp to check that binary data consists only of bytes with values 0-10?

data = 0x00 0x05 0x02 0x00 0x03 ... (not a string, binary data)

Upvotes: 3

Views: 7532

Answers (3)

Chinmay Kanchi

Reputation: 65993

If you want to check whether all characters in the given string are in the range 0x00 to 0x0B (upper bound excluded), a regex is overkill. Try something like this:

>>> # range(0x00, 0x0B) covers the values 0 through 10
>>> check_range = lambda x: ord(x) in set(range(0x00, 0x0B))
>>> s = '\x01\x02\x03\x0A'  # \x escapes need exactly two hex digits
>>> s2 = 'abcde'

>>> print(all(check_range(c) for c in s))
True
>>> print(all(check_range(c) for c in s2))
False
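
On Python 3 the same check gets simpler if the data is a bytes object, since iterating over bytes yields integers and no ord() call is needed. A minimal sketch under that assumption; the helper name all_in_range and its default bounds are made up for illustration:

```python
def all_in_range(data, low=0x00, high=0x0A):
    """Return True if every byte in data lies in [low, high], bounds included."""
    # Iterating over a bytes object yields ints in Python 3,
    # so each value can be compared directly.
    return all(low <= b <= high for b in data)


print(all_in_range(b'\x01\x02\x03\x0a'))
print(all_in_range(b'abcde'))
```

Note that an empty input passes the check, because all() over an empty sequence is True.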

Upvotes: 0

Marcelo Cantos

Reputation: 185962

This will match any character code below space (\040):

import re

if re.search('[\0-\037]', line):
    # Contains binary data...

I'm not sure what you mean by "0-10 byte", but if you mean that you want to match only the byte values 0 to 10, then replace \037 with \012 in the above code.

Note that 0-10 aren't the only codes that suggest binary data; anything below \040 or above \177 usually does as well.
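
As a rough sketch of that broader heuristic on Python 3, assuming the line is a bytes object: the pattern below flags any byte below \040 (0x20) or from \177 (0x7F) upward. Counting \177 (DEL) itself as binary is my assumption, and the names NON_PRINTABLE and looks_binary are made up for illustration.

```python
import re

# Bytes below 0x20 (space), plus 0x7F (DEL) and everything above it.
# Including DEL itself is an assumption of this sketch.
NON_PRINTABLE = re.compile(rb'[\x00-\x1f\x7f-\xff]')


def looks_binary(line):
    """Return True if line (a bytes object) contains a byte outside 0x20-0x7E."""
    return NON_PRINTABLE.search(line) is not None


print(looks_binary(b'hello world'))
print(looks_binary(b'abc\x00def'))
```

Keep in mind that tab, newline and carriage return are also below \040, so ordinary multi-line or tab-separated text would be flagged by this check as written.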

Upvotes: 2

Maxim Razin

Reputation: 9466

If you want to check that the string contains only characters between chr(0) and chr(10), simply use

re.match('^[\0-\x0A]*$', data)

For Python3, you can do the same with byte strings:

re.match(b'^[\0-\x0A]*$', b'\x01\x02\x03\x04')
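
A possible variant for Python 3.4 and later, where re.fullmatch is available: it requires the whole input to match, so the ^ and $ anchors can be dropped. This is only a sketch; the names ONLY_0_TO_10 and only_low_bytes are made up for illustration.

```python
import re

# Zero or more bytes, each with a value from 0 to 10 (bounds included).
ONLY_0_TO_10 = re.compile(rb'[\x00-\x0a]*')


def only_low_bytes(data):
    """Return True if data (a bytes object) contains only byte values 0-10."""
    # fullmatch succeeds only if the pattern covers the entire input.
    return ONLY_0_TO_10.fullmatch(data) is not None


print(only_low_bytes(b'\x00\x05\x02\x00\x03'))
print(only_low_bytes(b'\x00\x0b'))
```

Precompiling the pattern is just a convenience if the check runs many times; an empty input counts as a match here because of the * quantifier.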

Upvotes: 5
