Reputation: 4435
I am studying the gzip format, and I tried to grep its magic bytes, 1F 8B, in a sample archive. I used the manual from this answer.
xxd a.gz
Output:
00000000: 1f8b 0800 43dc 605b 0003 4bcb cf4f 4a2c ....C.`[..K..OJ,
00000010: e202 0047 972c b207 0000 00 ...G.,.....
grep -obUaP "\x1f" a.gz
Output:
0:
grep -obUaP "\x8b" a.gz
Output:
# Nothing is printed
For some reason, grep finds one byte but not the other. After some investigation, our best guess is that it fails on bytes with the most significant bit set, but we couldn't find a reasonable explanation.
Why does this happen, and is there a workaround?
Upvotes: 2
Views: 154
Reputation: 798606
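Probably because grep is working with UTF-8: in a UTF-8 locale, the pattern "\x8b" is treated as the code point U+008B, whose UTF-8 encoding is the two bytes 0xc2 0x8b, so a lone 0x8b byte never matches. "\x1f" matches because code points below 0x80 are encoded as a single identical byte. You will need to either disable grep's UTF-8 handling (for example by running it in the C locale, LC_ALL=C grep -obUaP "\x8b" a.gz) or switch to a tool that strictly interprets the search criteria as binary values.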
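To illustrate the point, here is a minimal Python sketch of the "strictly binary" alternative. It is not what grep does internally; it only shows how U+008B is encoded in UTF-8 and how a plain byte search behaves. The sample data is the first 16 bytes of the xxd dump from the question, and find_all is a hypothetical helper written for this example.

```python
# First 16 bytes of a.gz, copied from the xxd dump in the question.
data = bytes.fromhex("1f8b080043dc605b00034bcbcf4f4a2c")

# In UTF-8, the code point U+008B is encoded as two bytes: C2 8B.
utf8_pattern = "\x8b".encode("utf-8")


def find_all(haystack, needle):
    """Return every offset at which needle occurs in haystack (hypothetical helper)."""
    offsets = []
    start = haystack.find(needle)
    while start != -1:
        offsets.append(start)
        start = haystack.find(needle, start + 1)
    return offsets


print(utf8_pattern.hex())             # c28b
print(find_all(data, utf8_pattern))   # [] : the two-byte sequence is not in the file
print(find_all(data, b"\x8b"))        # [1] : the raw byte is at offset 1
print(find_all(data, b"\x1f\x8b"))    # [0] : the gzip magic bytes at offset 0
```

Searching with a bytes pattern avoids any locale or encoding interpretation, which is why the raw 0x8b byte is found here even though the UTF-8 encoded form is not.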
Upvotes: 3