Reputation: 21201
How do I write a regexp that matches a sequence of bytes?
For example, how can I check with a regexp that binary data consists only of bytes with values 0-10?
data = 0x00 0x05 0x02 0x00 0x03
... (not a string, binary data)
Upvotes: 3
Views: 7532
Reputation: 65993
If you want to check whether all characters in the given string are in the range 0x00 to 0x0B (not inclusive), a regex is overkill. Try something like this:
>>> check_range = lambda x: 0x00 <= ord(x) < 0x0B
>>> s = '\x01\x02\x03\x0A'
>>> s2 = 'abcde'
>>> print(all(check_range(c) for c in s))
True
>>> print(all(check_range(c) for c in s2))
False
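For actual binary data in Python 3, the same check without a regex can be sketched roughly as below. Iterating over a `bytes` object yields integers, so `ord()` is not needed. The function name `all_bytes_in_range` and its `low`/`high` parameters are made up for illustration and are not from the answer above.

```python
def all_bytes_in_range(data: bytes, low: int = 0x00, high: int = 0x0A) -> bool:
    """Return True if every byte in data lies in [low, high], inclusive.

    Hypothetical helper: iterating over bytes yields ints in Python 3.
    """
    return all(low <= b <= high for b in data)


# The sample data from the question
print(all_bytes_in_range(b"\x00\x05\x02\x00\x03"))  # True
print(all_bytes_in_range(b"abcde"))                 # False
```

Note that empty input passes the check, since `all()` over an empty sequence is true.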
Upvotes: 0
Reputation: 185962
This will match any character code below space (\040):
import re
if re.search('[\0-\037]', line):
    pass  # Contains binary data...
I'm not sure what you mean by "0-10 byte", but if you mean that you want to match only the byte values 0 to 10, then replace \037 with \012 in the above code.
Note that 0-10 aren't really the only codes that suggest binary data; anything below \040 or above \177 usually suggests binary data.
Upvotes: 2
Reputation: 9466
If you want to check that the string contains only characters between chr(0) and chr(10), simply use
re.match('^[\0-\x0A]*$', data)
For Python 3, you can do the same with byte strings:
re.match(b'^[\0-\x0A]*$', b'\x01\x02\x03\x04')
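A possible variation, assuming Python 3.4 or later, is to compile the pattern once and use `fullmatch`, which makes the `^` and `$` anchors unnecessary. The names `LOW_BYTES` and `only_low_bytes` are illustrative only.

```python
import re

# Zero or more bytes with values 0 through 10; fullmatch() requires the
# whole input to match, so no explicit anchors are needed.
LOW_BYTES = re.compile(rb"[\x00-\x0a]*")


def only_low_bytes(data: bytes) -> bool:
    """Return True if data consists solely of byte values 0..10."""
    return LOW_BYTES.fullmatch(data) is not None


# The sample data from the question
print(only_low_bytes(bytes([0x00, 0x05, 0x02, 0x00, 0x03])))  # True
print(only_low_bytes(b"\x00\x0b"))                            # False
```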
Upvotes: 5