Reputation: 501
I'm using 7z version 18.05 and I would like to list only the filenames in an archive.
If I use the command 7z l myArchive.7z
I get this output:
7-Zip 18.05 (x64) : Copyright (c) 1999-2018 Igor Pavlov : 2018-04-30
Scanning the drive for archives:
1 file, 146863932 bytes (141 MiB)
Listing archive: myArchive.7z
--
Path = myArchive.7z
Type = 7z
Physical Size = 146863932
Headers Size = 393
Method = LZMA:26
Solid = +
Blocks = 1
Date Time Attr Size Compressed Name
------------------- ----- ------------ ------------ ------------------------
2017-12-06 08:55:47 D...A 0 0 myArchive
2017-12-06 08:55:42 D...A 0 0 myArchive\folder
2017-12-05 19:50:41 ....A 21816530 146863539 myArchive\folder\Test.dat
2017-12-06 08:55:42 ....A 21877463 myArchive\folder\Test2.dat
2017-12-05 19:51:05 ....A 153953 myArchive\folder\Test3.dat
2017-12-05 19:50:41 ....A 4193 myArchive\folder\Test4.dat
2017-12-06 08:55:47 ....A 24128956 myArchive\log.txt
2017-12-06 08:55:47 ....A 79980 myArchive\readme.txt
2017-12-05 19:51:05 ....A 3256759999 myArchive\folder\zTest.txt
------------------- ----- ------------ ------------ ------------------------
2017-12-06 08:55:47 3324821074 146863539 7 files, 2 folders
I don't know why 7z doesn't have a switch to list only filenames. How can I get only the "Name" column? Any suggestions using a DOS command?
Upvotes: 11
Views: 30641
Reputation: 37206
Based on @Philippe's comment, it seems my original answer did not properly handle file paths that contain spaces. I took a look at the 7z source code and confirmed as I suspected that the data columns are of fixed width. The file path always starts after the 53rd character.
Therefore, you can use the following command to list all of the files (including those with spaces) in a given archive named archive.7z:
7z l -ba archive.7z | grep -vF 'D....' | grep -oP '(?<=^.{53}).*'
The first grep command removes directory entries from the list and the second grep command skips the first 53 characters and prints the remainder of each line, which will be the full file path including spaces.
If you want to print all of the directory paths in addition to the file paths, then simply remove the first grep command:
7z l -ba archive.7z | grep -oP '(?<=^.{53}).*'
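On systems where grep lacks -P (it is a GNU extension), the same fixed-width idea can be sketched with awk. This is a rough sketch, not a tested replacement: the function name names_from_listing is hypothetical, and the column positions (attributes starting at column 21, path starting at column 54) are assumptions taken from the 53-character offset above and from the sample listings in this thread, so they may not hold for every 7-Zip version. Checking only the first attribute character also skips directory rows shown as D...A, as in the question's listing.

```shell
# Hypothetical helper: reads `7z l -ba` output on stdin, prints file paths only.
# Assumes the fixed-width layout described above: the attribute field starts
# at column 21 and the path starts at column 54.
names_from_listing() {
    awk 'substr($0, 21, 1) != "D" { print substr($0, 54) }'
}
```

It would be used as: 7z l -ba archive.7z | names_from_listing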
I was able to do this by using the -ba flag to get output with a single line for each item in the archive, then using grep to parse each line to get only the filenames.
Consider this example, listing the contents of an archive that contains a single file baz nested within two levels of directories foo and bar.
user@host:~$ 7z l -ba 'archive.7z'
which resulted in the following output:
2021-06-21 14:37:09 D.... 0 0 foo
2021-06-21 14:37:41 D.... 0 0 foo/bar
2021-06-21 14:37:41 ....A 881 524 foo/bar/baz
Then, using grep to only get the path at the end of each line:
user@host:~$ 7z l -ba 'archive.7z' | grep -oP '\S+$'
giving the output:
foo
foo/bar
foo/bar/baz
If you wanted to list all items including directories, then you're finished with just the above. In my case, I was actually trying to get a preview of the items that would be extracted using 7z e, which extracts only files without the directory structure, so I added:
user@host:~$ 7z l -ba 'archive.7z' | grep -vF 'D....' | grep -oP '\S+$' | xargs -n 1 basename
which gives my desired output; for this example:
baz
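Because grep -oP '\S+$' and xargs both split on whitespace, that pipeline only suits paths without spaces. A minimal sketch that strips the directory part with sed instead is given here. The name base_names is hypothetical, and the sketch assumes one full path per line on stdin (for example from the 53-character command above), with either / or \ as the separator.

```shell
# Hypothetical helper: reads one path per line on stdin and prints only the
# final component, keeping any spaces in the name intact.
base_names() {
    # Greedy match: delete everything up to and including the last / or \
    sed 's|.*[/\\]||'
}
```

It would be used as: 7z l -ba archive.7z | grep -oP '(?<=^.{53}).*' | base_names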
Upvotes: 6
Reputation: 570
I was searching for an answer to this exact problem and found it via the link provided by Nisse Knudsen. For me, however, the undocumented -ba switch did not do what I needed on its own, and neither this switch alone nor Nisse's answer appears to fully answer the OP's question: the OP wants just the Name of each file, and -ba alone will not always work when parsing data from a .7z file. I would have made a comment rather than a full answer, but I have not earned enough rep to comment, I believe this information is still relevant and accurate to present, and my "comment" would have been too long anyway.
Following the link provided by Nisse (https://superuser.com/a/1073272/542975), the -slt switch produces output in a format that is much easier to loop over and parse; a simple For /f loop in a batch file can then extract what is needed.
Let me list a few changes in output for you to see what each switch is doing.
THIS: 7z.exe l "C:\Some Directory\Some FileIZipped.zip"
7-Zip [64] 15.12 : Copyright (c) 1999-2015 Igor Pavlov : 2015-11-19
Scanning the drive for archives:
1 file, 9986888 bytes (10 MiB)
Listing archive: C:\Some Directory\Some FileIZipped.zip
--
Path = C:\Some Directory\Some FileIZipped.zip
Type = zip
Physical Size = 9986888
Date Time Attr Size Compressed Name
------------------- ----- ------------ ------------ ------------------------
2017-07-18 12:19:04 ....A 240789 109401 A_RandomFile.doc
2017-07-05 13:32:42 ....A 19148800 9877487 Another_Random File with Spaces.mov
------------------- ----- ------------ ------------ ------------------------
2017-07-18 16:30:44 19389589 9986888 2 files
Becomes THIS: 7z.exe l -slt "C:\Some Directory\Some FileIZipped.zip"
7-Zip [64] 15.12 : Copyright (c) 1999-2015 Igor Pavlov : 2015-11-19
Scanning the drive for archives:
1 file, 9986888 bytes (10 MiB)
Listing archive: C:\Some Directory\Some FileIZipped.zip
--
Path = C:\Some Directory\Some FileIZipped.zip
Type = zip
Physical Size = 9986888
----------
Path = A_RandomFile.doc
Folder = -
Size = 240789
Packed Size = 109401
----[10 lines of jargon removed for clarity]----
Path = Another_Random File with Spaces.mov
Folder = -
Size = 19148800
Packed Size = 9877487
----[10 lines of jargon removed for clarity]----
Adding the -ba switch simplifies the format a little further, removing the need to skip the header lines (I reference this in the comments of the For loop in the script sample at the end).
This further becomes: 7z.exe l -ba -slt "C:\Some Directory\Some FileIZipped.zip"
Path = A_RandomFile.doc
Folder = -
Size = 240789
Packed Size = 109401
----[10 lines of jargon removed for clarity]----
Path = Another_Random File with Spaces.mov
Folder = -
Size = 19148800
Packed Size = 9877487
----[10 lines of jargon removed for clarity]----
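For readers working in a POSIX shell rather than batch, the Path lines in this output can be pulled out with a single sed expression. This is only a sketch: paths_from_slt is a hypothetical name, and it assumes every entry line has exactly the form "Path = name" as in the listing above.

```shell
# Hypothetical helper: reads `7z l -ba -slt` output on stdin and prints the
# value of every "Path = ..." line, spaces included.
paths_from_slt() {
    # -n suppresses default printing; p prints only the lines where the
    # "Path = " prefix was found and removed.
    sed -n 's/^Path = //p'
}
```

It would be used as: 7z l -ba -slt archive.zip | paths_from_slt. Without -ba, the first line printed would be the path of the archive itself.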
I am using this as a method of file-comparing an archive (zip/7z/rar) against the actual directory to make a mirror copy where the directory is the master. To do this I am parsing a file containing the output of my 7z list command. I suppose I could iterate the for loop directly from the 7z command instead, but I have found this to be slower in some situations when there's a large amount of data within the archives.
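The comparison step itself is not shown in this answer. As a rough shell sketch of the idea, assuming the archive listing and the directory listing have each been saved to a file with one path per line, using the same separators and the same relative root, the entries present only in the archive could be found as follows. The name only_in_first is hypothetical.

```shell
# Hypothetical helper: prints entries listed in the first file but not in
# the second. Both files hold one path per line, in any order.
only_in_first() {
    tmp_a=$(mktemp) || return 1
    tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 1; }
    sort -u "$1" > "$tmp_a"
    sort -u "$2" > "$tmp_b"
    # comm -23 keeps only the lines unique to the first sorted input
    comm -23 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}
```

It would be used as: only_in_first archive_list.txt directory_list.txt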
I have had multiple instances where trying to parse the standard output fails. It occurs when listing the contents of a .7z archive, as shown below, and it is not EASILY resolved using a For loop that splits on spaces. On most lines, Token 5 is the Compressed size and Token 6 is the filename, as it is in .zip format archives. On lines with an empty Compressed column, however, the filename becomes Token 5 instead. That makes for a very messy situation which is a nightmare to plan for. This is also the exact problem visible in the example the OP provided.
Example similar to what the OP provided:
7-Zip [64] 15.12 : Copyright (c) 1999-2015 Igor Pavlov : 2015-11-19
Scanning the drive for archives:
1 file, 446600 bytes (437 KiB)
Listing archive: C:\Some Directory\Some _OTHER_ FileIZipped.7z
--
Path = C:\Some Directory\Some _OTHER_ FileIZipped.7z
Type = 7z
Physical Size = 446600
Headers Size = 283
Method = LZMA2:22
Solid = +
Blocks = 1
Date Time Attr Size Compressed Name
------------------- ----- ------------ ------------ ------------------------
2020-08-28 15:06:46 D.... 0 0 SomeDirectoryInside
2020-08-28 15:06:46 D.... 0 0 SomeDirectoryInside\OtherDir
2020-08-28 15:06:46 D.... 0 0 SomeDirectoryInside\Zips
2020-08-28 15:13:14 ..... 1064960 446317 SomeDirectoryInside\Zips\Some_File.Doc
2020-08-28 15:08:02 ..... 313080 SomeDirectoryInside\Zips\Some_Other_File.Doc
2020-08-28 15:07:34 ..... 1561728 SomeDirectoryInside\Zips\Foo.mov
2020-08-28 15:07:46 ..... 262144 SomeDirectoryInside\Zips\Fancy.Doc
2020-08-28 15:07:26 ..... 262144 SomeDirectoryInside\Zips\Fancy2.Doc
------------------- ----- ------------ ------------ ------------------------
2020-08-28 15:13:14 3464056 446317 5 files, 3 folders
Below is a batch script sample I wrote that writes the 7z.exe output to a file and then pulls just the data I need from that file. Forgive the multiple REM lines; I prefer this method of commenting over long single-line strings so readers do not have to scroll the code block to the right in order to read.
Because of how For /f iterates through data, we need to check whether token %%c is blank. I am using this method because sometimes our files have spaces in their names, and we are parsing the 7z output using spaces as the delimiter.
Tokens=1,3* gives you two separate tokens to check besides %%a: %%b [ Token 3 ] and %%c [ Token * ]. If %%c is blank, we know the filename in %%b has no spaces and can safely be echoed to whichever file we need, set as a variable to use later, etc.
@Echo Off
REM Sending the output of 7z into a file to use later
7z.exe l -slt "SomeFileIZipped.zip" >"ZipListRAW.txt"
REM Example of 7z.exe command with '-ba' switch
REM 7z.exe l -ba -slt "SomeFileIZipped.zip"
REM If you do not use '-ba' in the 7z command above, you can simply skip the first
REM 11-12 lines of the file to get ONLY the filenames (this skips past the first line
REM containing "Path", which holds the original archive filename).
For /f "Usebackq Skip=11 Tokens=1,3* Delims= " %%a in ("ZipListRAW.txt") do (
    REM Checking if %%a equals word "Path"
    If "%%a"=="Path" (
        REM Quoted comparison, so a %%c that itself contains spaces does not break the If
        If "%%c"=="" (
            Echo %%b
        ) ELSE (
            Echo %%b %%c
        )
    )
)
Upvotes: 5
Reputation: 738
If you don't like the sound of using an undocumented command line switch, you can do the following to parse out the filename from the full output. This awk script determines the start index of the Name column header, and uses that to extract the column from the table.
awk_script='{
    if (ix == 0) {
        ix = index($0, "Name");
    }
    p = (body == 1);
    if (ix > 0) {
        # The table body is delimited by dashed lines, after the "Name" column header has been seen.
        # The pattern is anchored at the start of the line so that dashes inside a file name
        # are not mistaken for a delimiter.
        body = (body + ($0 ~ /^ *-[ -]+/)) % 2;
    }
    if (p == 1 && body == 1) {
        # Only print if "body" was 1 before and after the previous block; otherwise, we are on
        # a table body delimiter line (or outside the table completely)
        print substr($0, ix);
    }
}'
7z l your-file.7z | awk "$awk_script"
PowerShell equivalent:
$ix = -1
$body = $false
& 7z l your-file.7z | foreach {
    if ($ix -eq -1) {
        $ix = $_.IndexOf("Name")
    }
    $p = $body
    if ($ix -gt 0) {
        # The table body is delimited by dashed lines, after the "Name" column header has been seen.
        # The pattern is anchored at the start of the line so that dashes inside a file name
        # are not mistaken for a delimiter.
        $body = ($body -ne ($_ -match '^ *-[ -]+'))
    }
    if ($p -and $body) {
        # Only print if $body was true before and after the previous block; otherwise, we are on
        # a table body delimiter line (or outside the table completely)
        Write-Output $_.Substring($ix)
    }
}
Upvotes: 0
Reputation: 328
Found this answer in a different thread: https://superuser.com/a/1073272/542975
There is an undocumented switch -ba which removes all of the header and table formatting, and only lists the row entries.
From there, you could parse every line and split it on whitespace or tabs, or potentially go with a regex.
Upvotes: 23
Reputation: 16256
If you can install a PowerShell module on your machine, listing the file names is easy enough. This can be done on any modern-day, supported Windows system.
https://www.powershellgallery.com/packages/7Zip4Powershell/1.9.0 describes how to install the module.
Here is a .bat file script showing its usage and output.
C:>TYPE zipfnlist.bat
@ECHO OFF
SET "ZIP_FILENAME=.\7zIntf20.zip"
powershell -NoLogo -NoProfile -Command (Get-7Zip -ArchiveFileName "%ZIP_FILENAME%").FileName
C:>CALL zipfnlist.bat
bin
Properties
Ole32.cs
Program.cs
SevenZipFormat.cs
SevenZipInterface.cs
SevenZip.csproj
SevenZip.sln
bin\Debug
bin\Release
bin\Release\SevenZip.exe
Properties\AssemblyInfo.cs
Upvotes: -1