Reputation: 657
I have binary files and I want to delete everything before (and including) a certain sequence of bytes (five times '7e'). For example I have a file test:
hexdump test
0000000 000a 4ffa 0a0d 7e7e 7e7e 837e 646f 0110
0000010 8318 dac3
0000014
The result should be:
hexdump test1
0000000 6f83 1064 1801 c383 00da
0000009
I tried cat test | sed 's/.*~~~~~//'
However, that only deletes the '~~~~~' itself and keeps everything else, including the bytes before it.
Upvotes: 1
Views: 339
Reputation: 44023
Using sed with binary files is not going to end well, since it does some locale- and encoding-dependent things and generally expects to work on text files. There is another utility, bbe (binary block editor), that is better suited to this task. With it, you can do this:
bbe -b ':/~~~~~/' -e 'D 1' test
This states that blocks are units that end with ~~~~~ and instructs bbe to delete the first of them (D 1).
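To check a result against the question's example, it helps to have the exact bytes on disk. A small sketch that recreates them with printf; the function names make_sample and make_expected are made up for illustration, and the octal escapes are transcribed from the hexdump output in the question (which prints little-endian 16-bit words):

```shell
# Hypothetical helpers, not part of bbe or sed.

# Recreate the 20-byte sample file from the question.
# Bytes: 0a 00 fa 4f 0d 0a 7e 7e 7e 7e 7e 83 6f 64 10 01 18 83 c3 da
make_sample() {
    printf '\n\000\372O\r\n~~~~~\203od\020\001\030\203\303\332' > "$1"
}

# Write the 9 bytes the question expects to remain after the marker.
# Bytes: 83 6f 64 10 01 18 83 c3 da
make_expected() {
    printf '\203od\020\001\030\203\303\332' > "$1"
}
```

After make_sample test and make_expected expected, running the bbe command above with its output redirected to test1 and then cmp test1 expected should report no difference if the command did what was asked.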
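The problem you run into with sed, discounting encoding snafu, is that sed works line by line. Your sample contains 0a (newline) bytes before the ~~~~~, so s/.*~~~~~// only touches the line that holds the marker and leaves the earlier lines in place. If you are hell-bent on doing it with sed (in which case you can expect random failures), this might work on some platforms: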
sed '1,/~~~~~/ { /~~~~~/!d; s/^.*~~~~~// }' test
This will, in the pattern range 1,/~~~~~/ (from the first line to the first that contains ~~~~~), delete lines that do not contain ~~~~~ and remove the part up to and including ~~~~~ from the line that eventually does. This is more brittle than the bbe approach in more ways than one; apart from the encoding snafu, it will break if ~~~~~ appears twice between two 0a (newline) bytes, because the greedy .* then removes everything through the last occurrence on that line. If this is for serious use, go with bbe.
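If bbe is not installed, another option is to sidestep line handling altogether by looking up the byte offset of the marker and copying everything after it. This is only a rough sketch, not something from the tools above: strip_through_marker is a made-up name, it assumes GNU grep (for -a, -b and -o together), and it assumes the marker is plain ASCII without newlines.

```shell
# Hypothetical helper: print everything after the first occurrence of
# MARKER in FILE to stdout. Assumes GNU grep and an ASCII marker.
strip_through_marker() {
    marker=$1
    file=$2
    # -a: treat binary data as text, -b: print the byte offset,
    # -o: report the match itself (so -b gives the offset of the match),
    # -F: fixed string, no regex. LC_ALL=C keeps grep working on raw bytes.
    offset=$(LC_ALL=C grep -a -b -o -F -- "$marker" "$file" | head -n 1 | cut -d: -f1)
    # No offset means the marker was not found; emit nothing and fail.
    [ -n "$offset" ] || return 1
    # tail -c +N starts output at byte N (1-based), so skip the bytes
    # before the marker plus the marker itself.
    tail -c "+$((offset + ${#marker} + 1))" "$file"
}
```

For the file from the question this would be called as strip_through_marker '~~~~~' test > test1. Unlike the sed variant, it stops at the first occurrence of the marker no matter where newline bytes fall.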
Upvotes: 1