Reputation: 12710
I want to extract part of a binary file, from byte #480161397 to byte #480170447 (inclusive, 9051 bytes in total).
I use cut -b, and I expected the size of trunk1.gz to be 9051 bytes, but I get a different result.
$ wget https://commoncrawl.s3.amazonaws.com/crawl-data/CC-MAIN-2016-07/segments/1454701152097.59/warc/CC-MAIN-20160205193912-00264-ip-10-236-182-209.ec2.internal.warc.gz
$ cut -b480161397-480170447 CC-MAIN-20160205193912-00264-ip-10-236-182-209.ec2.internal.warc.gz >trunk1.gz
$ echo $((480170447-480161397+1))
9051
$ ls -l trunk1.gz
-rw-r--r-- 1 david staff 3400324 Sep 8 10:28 trunk1.gz
What is wrong?
Upvotes: 2
Views: 261
Reputation: 13249
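If you are working with binary data, I advise you to use the dd command.
dd if=CC-MAIN-20160205193912-00264-ip-10-236-182-209.ec2.internal.warc.gz of=trunk1.gz bs=1 skip=480161397 count=9051
bs is the block size and is set to 1 byte, so skip and count are both measured in bytes. skip is the number of bytes passed over before copying starts, so this command treats 480161397 as a zero-based offset; if your positions are one-based, as in cut -b, use skip=480161396 instead.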
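To make the relationship between an inclusive byte range and dd's skip and count explicit, here is a minimal sketch. The helper name extract_range and the file sample.bin are made up for illustration, and the positions are assumed to be one-based as with cut -b.

```shell
# extract_range FILE FIRST LAST
# Hypothetical helper: writes bytes FIRST..LAST (one-based, inclusive) to stdout.
extract_range() {
    file=$1
    first=$2
    last=$3
    # dd skips whole blocks; with bs=1 a block is one byte,
    # so skipping first-1 bytes makes byte FIRST the first one copied.
    dd if="$file" bs=1 skip=$((first - 1)) count=$((last - first + 1)) 2>/dev/null
}

# Throwaway test file with embedded newlines, to show they are treated as plain bytes.
printf '0123456789\nabcdefghij\n' > sample.bin

# Bytes 4..13 span the first newline.
extract_range sample.bin 4 13
```

On the sample file this should print 3456789, a newline, and then ab, that is 10 bytes regardless of where the line breaks fall.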
Upvotes: 1
Reputation: 32474
cut -bN-M copies bytes N through M (counting from 1) of every line of the input, not of the file as a whole.
Example:
$ cut -b4-7 <<END
0123456789
abcdefghij
ABCDEFGHIJ
END
Output:
3456
defg
DEFG
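This per-line behaviour probably also explains the 3400324 bytes in the question: when the requested range lies beyond the end of a line, the cut implementations I am aware of (GNU and BSD) still print an empty line for it, so the output would contain roughly one newline per newline byte in the binary file. A small sketch with made-up input:

```shell
# Three short lines; the requested range 100-200 is past the end of each.
# cut selects nothing from any line but still emits one newline per input line.
out_size=$(printf 'ab\ncd\nef\n' | cut -b100-200 | wc -c | tr -d ' ')
echo "$out_size"
```

The printed size should equal the number of input lines (3 here), not the length of the requested range.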
Consider using dd for your purposes.
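If dd feels awkward, a combination of tail and head can extract the same range. This is only a sketch: byte_range and sample.bin are hypothetical names, positions are assumed to be one-based, and it assumes your head supports -c (GNU and BSD versions do, but it is not required by POSIX).

```shell
# byte_range FILE FIRST LAST
# Hypothetical helper: writes bytes FIRST..LAST (one-based, inclusive) to stdout.
byte_range() {
    # tail -c +N starts output at byte N; head -c keeps only the first LEN bytes.
    tail -c +"$2" "$1" | head -c "$(( $3 - $2 + 1 ))"
}

# Throwaway test file with embedded newlines.
printf '0123456789\nabcdefghij\n' > sample.bin

byte_range sample.bin 4 13
```

As with dd, newlines are treated as ordinary bytes, so the result is always LAST-FIRST+1 bytes long when the file is large enough.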
Upvotes: 2