I found that I can sandwich a .7z archive file between other binary data, and 7za is still able to list (and presumably extract) files from the archive. I think this is useful, e.g., to create a quintuple-format file: a DOS kernel, DOS application, and DOS device driver all at the beginning, a .7z (or .zip) archive in the middle, and an appended Extension for lDebug (ELD) at the end.
I'd like to know the exact conditions under which the archive is found, but this does not seem to be well documented anywhere. For instance, https://py7zr.readthedocs.io/en/latest/archive_format.html lists:
Signature
The first six bytes of a 7-zip file SHALL always contain b'7z\xbc\xaf\x27\x1c'.
This may describe the format, but my test indicates that the archive is also found when it is sandwiched, in which case the signature does not appear at the very beginning of the actual file.
The format description in the 7-Zip sources similarly does not specify how the signature is searched for: https://github.com/ip7z/7zip/blob/e5431fa6f5505e385c6f9367260717e9c47dc2ee/DOC/7zFormat.txt#L168
7z format headers
SignatureHeader
BYTE kSignature[6] = {'7', 'z', 0xBC, 0xAF, 0x27, 0x1C};
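For reference, the same 7zFormat.txt goes on to describe the rest of the 32-byte SignatureHeader: two version bytes, a StartHeaderCRC, and a 20-byte StartHeader (NextHeaderOffset, NextHeaderSize, NextHeaderCRC). Below is a minimal Python sketch of checking whether such a header starts at a given offset. It assumes that StartHeaderCRC is the standard CRC-32 as computed by zlib.crc32; the function name and the returned dictionary are made up for illustration and are not taken from 7-Zip or py7zr.

```python
import struct
import zlib

SIGNATURE = b"7z\xbc\xaf\x27\x1c"
HEADER_SIZE = 32  # 6 signature + 2 version + 4 CRC + 20 StartHeader


def parse_signature_header(data, offset=0):
    """Hypothetical helper: return the header fields if a plausible 7z
    SignatureHeader starts at `offset` in `data`, otherwise None."""
    raw = data[offset:offset + HEADER_SIZE]
    if len(raw) < HEADER_SIZE or raw[:6] != SIGNATURE:
        return None
    # Little-endian: Major, Minor, StartHeaderCRC
    major, minor, start_crc = struct.unpack_from("<BBI", raw, 6)
    start_header = raw[12:32]
    # Assumption: StartHeaderCRC is a plain CRC-32 over the 20 StartHeader bytes
    if zlib.crc32(start_header) != start_crc:
        return None
    next_offset, next_size, next_crc = struct.unpack("<QQI", start_header)
    return {
        "version": (major, minor),
        "next_header_offset": next_offset,
        "next_header_size": next_size,
        "next_header_crc": next_crc,
    }
```

A check like this would reject most accidental occurrences of the six signature bytes in unrelated binary data, but whether 7za applies the same validation while searching is exactly what I could not determine.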
I looked through the sources a bit but was unable to find the exact handling for this. Bits that seem relevant:
static Byte k_Signature_Dec[kSignatureSize] = {'7' + 1, 'z', 0xBC, 0xAF, 0x27, 0x1C};
REGISTER_ARC_IO_DECREMENT_SIG(
"7z", "7z", NULL, 7,
k_Signature_Dec,
0,
NArcInfoFlags::kFindSignature
The kFindSignature flag seems to be read in https://github.com/ip7z/7zip/blob/e5431fa6f5505e385c6f9367260717e9c47dc2ee/CPP/7zip/UI/Common/LoadCodecs.h#L145
bool Flags_FindSignature() const { return (Flags & NArcInfoFlags::kFindSignature) != 0; }
Which is used in https://github.com/ip7z/7zip/blob/e5431fa6f5505e385c6f9367260717e9c47dc2ee/CPP/7zip/UI/Common/OpenArchive.cpp#L2115
{
  const CArcInfoEx &ai = op.codecs->Formats[(unsigned)formatIndex];
  if (ai.FindExtension(extension) >= 0)
  {
    if (ai.Flags_FindSignature() && searchMarkerInHandler)
      return S_FALSE;
  }
}
However, I don't understand how this implements a search for the signature.
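To make concrete what I mean by "searching for the signature", here is a naive byte scan in Python. This is only an illustration of the idea, not a claim about what OpenArchive.cpp does; the function name and the optional `limit` parameter are hypothetical, and I do not know whether 7za bounds its search this way.

```python
SIGNATURE = b"7z\xbc\xaf\x27\x1c"


def find_signature_offsets(data, limit=None):
    """Hypothetical helper: return every offset in `data` at which the
    six signature bytes occur, optionally looking only at the first
    `limit` bytes of the buffer."""
    end = len(data) if limit is None else min(len(data), limit)
    offsets = []
    pos = data.find(SIGNATURE, 0, end)
    while pos != -1:
        offsets.append(pos)
        # Continue one byte further so that later matches are found too
        pos = data.find(SIGNATURE, pos + 1, end)
    return offsets
```

Part of my question is whether the real search is this permissive: any offset, any alignment, any amount of leading data.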
Here's a test case using 7za v16.02 on a Linux host's command line:
$ dd if=/dev/random of=prefix bs=1024 count=10
10+0 records in
10+0 records out
10240 bytes (10 kB, 10 KiB) copied, 0.000207353 s, 49.4 MB/s
$ dd if=/dev/random of=suffix bs=1024 count=10
10+0 records in
10+0 records out
10240 bytes (10 kB, 10 KiB) copied, 0.000172483 s, 59.4 MB/s
$ touch foo
$ 7za -t7z a foo.7z foo
7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,4 CPUs AMD EPYC-Rome Processor (830F10),ASM,AES-NI)
Scanning the drive:
1 file, 0 bytes
Creating archive: foo.7z
Items to compress: 1
Files read from disk: 0
Archive size: 82 bytes (1 KiB)
Everything is Ok
$ cat prefix foo.7z suffix > middle.bin
$ 7za l middle.bin
7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,4 CPUs AMD EPYC-Rome Processor (830F10),ASM,AES-NI)
Scanning the drive for archives:
1 file, 20562 bytes (21 KiB)
Listing archive: middle.bin
--
Path = middle.bin
Type = 7z
WARNINGS:
There are data after the end of archive
Offset = 10240
Physical Size = 82
Tail Size = 10240
Headers Size = 82
Solid = -
Blocks = 0
   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2024-12-25 21:26:30 ....A            0            0  foo
------------------- ----- ------------ ------------  ------------------------
2024-12-25 21:26:30                  0            0  1 files
Warnings: 1
$
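The numbers in the listing are consistent with the way middle.bin was built: Offset equals the prefix size, Physical Size equals the size of foo.7z, and Tail Size is whatever remains. A small Python check of that arithmetic, using only the values from the transcript above (the function name is made up for illustration):

```python
def tail_size(total_size, offset, physical_size):
    """Hypothetical helper: bytes left after the archive, given the file
    size, the archive's start offset, and the archive's physical size."""
    return total_size - offset - physical_size


prefix_size = 10240    # dd: 10 blocks of 1024 bytes
archive_size = 82      # "Archive size: 82 bytes"
suffix_size = 10240    # dd: 10 blocks of 1024 bytes
total_size = prefix_size + archive_size + suffix_size

print(total_size)                                        # 20562
print(tail_size(total_size, prefix_size, archive_size))  # 10240
```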
Question: What are the exact conditions for finding the .7z archive in the middle of a file?