Reputation: 5227
I have a file with multiple JPEGs inside, and I would like to split it into individual JPEG files.
The easy part is finding the beginning: the bytes 0xFF 0xD8 0xFF 0xE1
mark the start of the JPEG and of the EXIF data field, which in my case always comes first.
So I found this awk command:
awk '/string/{n++}{print >"out" n ".txt" }' final.txt
This should split the file, but it does not work as expected when I use it with hex values:
awk '/0xFF0xD8 0xFF0xE1/{n++}{print >"out" n ".txt" }' final.txt
The awk documentation says that strings starting with 0x are treated as hex, but that does not seem to work here.
Edit: I found this: https://superuser.com/questions/174362/how-to-split-binary-file-based-on-pattern but it does not work for me. It should create 2 files, but only one is created, and it is only 11 bytes in size.
Upvotes: 0
Views: 3610
Reputation: 212504
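Perl is probably the preferred tool, but awk can handle it too (gawk is assumed here, since a multi-character RS is an extension). The separator is stripped from each record, so the marker has to be written back, and the empty record before the first marker skipped:
LC_ALL=C awk 'NR > 1 {printf "%s%s", RS, $0 > ("out" (NR-1) ".jpg")}' RS="$(printf '\xff\xd8\xff\xe1')" final.txt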
Upvotes: 0
Reputation: 2593
Are you sure awk handles binary files well? I thought it would expect newlines.
Perl can use hex escapes in regexes (Basic idea from this answer):
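#!/usr/bin/perl
use strict;
use warnings;

undef $/;                # slurp mode: read the whole input at once
my $data = <>;
my $n = 0;
# Split before each FF D8 FF E1 (the marker from the question; use \xE0 for JFIF files).
# The lookahead keeps the marker at the start of each piece.
for my $content (split(/(?=\xFF\xD8\xFF\xE1)/, $data)) {
    open(my $out, '>:raw', "out" . ++$n . ".jpg") or die "Cannot open output file: $!";
    print $out $content;
    close($out);
}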
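For comparison, here is a minimal sketch of the same idea in Python, in case working on raw bytes directly is easier to follow. It assumes the marker from the question (FF D8 FF E1) and that the whole file fits in memory. The names split_jpegs and write_pieces are made up for this example. Unlike a plain split, it drops any bytes that come before the first marker.

```python
import re
from pathlib import Path

# Assumed marker: JPEG start-of-image (FF D8) followed by the EXIF APP1 marker (FF E1).
MARKER = b"\xff\xd8\xff\xe1"

def split_jpegs(data, marker=MARKER):
    """Return the pieces of data that start at each occurrence of marker."""
    # Byte offsets of every marker in the input.
    starts = [m.start() for m in re.finditer(re.escape(marker), data)]
    # Each piece runs from one marker to the next, or to the end of the data.
    ends = starts[1:] + [len(data)]
    return [data[a:b] for a, b in zip(starts, ends)]

def write_pieces(pieces, prefix="out"):
    """Write each piece to out1.jpg, out2.jpg, ... in the current directory."""
    for n, piece in enumerate(pieces, start=1):
        Path(f"{prefix}{n}.jpg").write_bytes(piece)

# Hypothetical usage, with the file name from the question:
# write_pieces(split_jpegs(Path("final.txt").read_bytes()))
```

Checking len(split_jpegs(...)) before writing anything is a quick way to confirm that the marker really occurs as often as expected.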
Upvotes: 1