curiosity22

Reputation: 31

Batch extracting specific file types from master directory with many subdirectories

I have a number of folders containing archive files: a parent folder with subfolders, e.g.

Graphics:

etc. Each folder contains an archive, with various names and numbers. Each archive contains various files: PDFs, txt files, invoices and image files. Generally the images are .jpg files with various names.

What I would like to do is batch process the parent folder to extract the image file(s) from each archive in each subdirectory, and leave the images in the subdirectory alongside the archive they came from. If an archive has multiple images that's fine; I am not targeting a single particular image.

Hopefully I'd end up with something like:

Graphics:

If possible, I would rather avoid extracting all the files, separating out the images, and then having to re-archive.

What I originally tried to work out was how to batch extract the image files to the directory each belongs to and have each image renamed to its directory name. I didn't get close to a solution, so I think just extracting the images would be fine, and I can use a renaming app for the rest; I've found Bulk Rename Utility to be just fine once I got my head around it. You wouldn't think that over the years you would collect so many archives; like small drops, they ended up becoming an ocean.

I have tried searching Stack and have seen a lot of examples of how e.g. 7zip works, but I just can't quite get my head around it.

I am due to retire; they tell me 65 is the time for the chicken coop. I've been a pencil pusher and mouse skater most of my life in the gfx industry. I used to know what was in each archive, but my memory is a little, how to say, rusty nowadays. I do know all my archives have images in them. My life would be a lot easier in the sunset of it if I could look at the pictures and not have to rack my brains trying to remember what was in the archive itself.

Cheers and ty in advance from the colonies downunder.

Grumpy

Upvotes: 2

Views: 1045

Answers (3)

curiosity22

Reputation: 31

I found the solution to my issue: no coding, just using an app I've been using for years, Total Commander.

Solution was simple in the end.

I open up Total Commander and search for the archive files I want with Alt+F7. It lists all my archives (.7z, .zip, whatever you have) in the right-hand frame.

Then select "Feed to listbox", found in the bottom right-hand corner. Press Ctrl+A, then Alt+F9, which gives you some options.

Clear the "Unpack specific files from archive to" field (make sure it's blank), then in "Files to unpack" tell it what you're looking for, in my case .jpg (it can be any specific file).

Untick "Unpack path names if stored with files", tick or untick "Overwrite existing files" as you prefer, and untick "Unpack each archive to a separate subdir (name of the archive)". Hit OK.

It will then search for the file(s) you are looking for and unpack them into the directory or subdirectory where each archive is found.

Job done... No coding just using TC.. marvellous app

Upvotes: 1

Magoo

Reputation: 79947

@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION 
rem The following settings for the source directory, destination directory, target directory,
rem batch directory, filenames, output filename and temporary filename [if shown] are names
rem that I use for testing and deliberately include names which include spaces to make sure
rem that the process works using such names. These will need to be changed to suit your situation.

SET "sourcedir=u:\your files"
SET "destdir=u:\your results"
SET "outfile=%destdir%\outfile.txt"

SET "extensions=bat txt"
SET "archives=7z zip"

CALL :prefix sdexts *. %extensions%
CALL :prefix sdarchives *. %archives%
CALL :prefix dexts . %extensions%
CALL :prefix darchives . %archives%

PUSHD "%sourcedir%"

(
FOR /f "delims=" %%b IN ('dir /b /s /a-d %sdexts% %sdarchives%') DO (
 SET "unreported=%%b"
 FOR %%o IN (%darchives%) DO IF /i "%%~xb"=="%%o" CALL :procarch%%o&SET "unreported="
 IF DEFINED unreported ECHO %%~dpb ^| %%~nxb
)
)>"%outfile%"

popd

GOTO :EOF

:: return %1 with each other argument preceded by '%2'

:prefix 
SET "$1=%1="
shift
SET "$2=%~1"
:sdl
SHIFT
IF "%1"=="" SET "%$1%"&GOTO :eof
SET "$1=%$1% %$2%%1"
GOTO sdl

:: Process .7z or .zip archives

:procarch.7z
:procarch.zip
SET "skipme=Y"
FOR /f "delims=" %%e IN ('7z L "%unreported%"') DO (
 FOR /f "tokens=3delims= " %%o IN ("%%e") DO (
 IF "%%o"=="------------" (
  IF DEFINED skipme (SET "skipme=") ELSE (SET "skipme=Y")
 ) ELSE IF NOT DEFINED skipme ECHO "%%o"|FIND "D" >NUL &IF ERRORLEVEL 1 (
  SET "filename=%%e"
  FOR /f "delims=" %%y IN ("!filename:~53!") DO (
   FOR %%c IN (%dexts%) DO IF /i "%%c"=="%%~xy" ECHO %unreported% ^| %%y
  )
  )
 )
)

GOTO :eof

I've a similar requirement, so I spent a bit of time on this...

The first part simply sets the directories to be used.

I used bat and txt as extensions required for testing. Change as desired - just a space-separated list.

Similarly, I chose to examine .zip and .7z archives. Since 7zip can deal with both, the same subroutine can be used to process both. Other archive types may need different processing.

The code uses dir to locate the files of interest, and dir can take a series of arguments, so the routine :prefix converts the list provided as %3+ to a list in variable %1, each term prefixed by the string at %2.

So - sdexts will become *.bat *.txt and dexts .bat .txt for instance.
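For readers who find the batch syntax hard to follow, the effect of :prefix can be sketched in Python. This is only an illustration and not part of the batch file; the function name prefix_list is made up here.

```python
def prefix_list(prefix, items):
    """Return the space-separated items, each preceded by `prefix`."""
    return " ".join(prefix + item for item in items.split())


extensions = "bat txt"
print(prefix_list("*.", extensions))  # *.bat *.txt
print(prefix_list(".", extensions))   # .bat .txt
```

The batch version builds the same list one SHIFT at a time because cmd has no built-in way to map over a list of arguments.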

Next, switch to the source dir and run the dir command to list the required files. For each name returned in %%b (delims= assigns the full line to %%b, /b gives a basic list (names only), /s processes subdirectories, /a-d suppresses directory names), the flag unreported is set to the filename found, then the extension of that filename is compared to the darchives list. If a match is found, the procarch%%o routine is executed and the unreported flag is set to empty. Then, if unreported has not been cleared, it's not an archive file, so report it.
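The same walk-and-classify step might look like this in Python. It is a rough sketch of the logic only: the function name scan and the returned lists are illustrative, and the batch file reports plain files immediately rather than collecting them.

```python
import os
import tempfile


def scan(source_dir, extensions, archives):
    """Walk source_dir and split matching files into plain files and archives.

    extensions and archives are sets of lowercase suffixes such as ".txt".
    """
    plain, found_archives = [], []
    for folder, _subdirs, names in os.walk(source_dir):
        for name in sorted(names):
            ext = os.path.splitext(name)[1].lower()  # like %%~xb, compared with /i
            path = os.path.join(folder, name)
            if ext in archives:
                found_archives.append(path)  # the batch file calls :procarch here
            elif ext in extensions:
                plain.append(path)           # "unreported" stayed set: report it
    return plain, found_archives


if __name__ == "__main__":
    # Tiny demonstration on a throwaway directory tree.
    with tempfile.TemporaryDirectory() as root:
        for name in ("notes.txt", "old work.zip", "photo.jpg"):
            open(os.path.join(root, name), "w").close()
        plain, found = scan(root, {".bat", ".txt"}, {".7z", ".zip"})
        print([os.path.basename(p) for p in plain])  # ['notes.txt']
        print([os.path.basename(p) for p in found])  # ['old work.zip']
```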

procarch%%o resolves to either procarch.7z or procarch.zip, and a routine (actually the same routine) is provided for each.

The :prefix routine sets $1 to %1=, so for example, sdexts=. Then it shifts the parameter list and sets $2 to the prefix to be placed before each term.

Then shift again. If %1 is not empty, append a space and the prefix in $2 and %1 to $1 and repeat.

When %1 becomes empty, execute the command set "%$1%" and finish. So, as $1 would then contain the string sdexts= *.bat *.txt, sdexts would be set to the list required.

The routine :procarch.7z or procarch.zip (the first simply falls through to the second, so the same processing is executed) sets the flag skipme to non-empty as we need to skip 7z's header data.

%%e acquires each line of the 7z L report in turn. %%o is assigned the third token of the 7z report line. This will be junk up to the line with a series of dashes, then the attribute report for the archived items, then another series of dashes, then junk.

So - if %%o is a string of dashes, simply switch skipme to empty/not empty.

If %%o is not dashes and skipme is not defined, then we are between the two lines of dashes (i.e. on an archived item line), so we see whether %%o contains D (which would indicate that this is a directory name, which we do not want to report). If it does not, errorlevel will be set non-zero by the find (the output of the find is sent to nul = the æther) and filename is set to the full report line. We're not interested in the first 53 characters (hence !filename:~53!), but the remainder contains the archived filename, so see whether the extension of the filename found is one of interest (in dexts) and, if it is, report it.
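The dash-toggling logic can be sketched in Python as follows. The sample listing is made up, and its column layout is an assumption based on the offset 53 used in the batch file; real 7z output may differ between versions, so treat this as an illustration rather than a reliable parser.

```python
import os

NAME_COLUMN = 53   # same offset as !filename:~53! in the batch file
RULE = "-" * 12    # third token of the dashed separator lines


def wanted_members(listing, extensions):
    """Return archived file names whose extension is in `extensions`."""
    inside = False  # True only between the two dashed lines
    names = []
    for line in listing.splitlines():
        tokens = line.split()
        if len(tokens) < 3:
            continue  # like FOR /F "tokens=3": lines without a third token are skipped
        third = tokens[2]
        if third == RULE:
            inside = not inside            # toggle, as skipme does
        elif inside and "D" not in third:  # attribute column without D: a file
            name = line[NAME_COLUMN:]
            if os.path.splitext(name)[1].lower() in extensions:
                names.append(name)
    return names


def _row(attr, size, packed, name):
    """Build one made-up listing row with the name starting at column 53."""
    return f"2020-01-01 12:00:00 {attr:<5} {size:>12} {packed:>12}  {name}"


_DASHES = " ".join(["-" * 19, "-" * 5, "-" * 12, "-" * 12]) + "  " + "-" * 24

SAMPLE = "\n".join([
    "   Date      Time    Attr         Size   Compressed  Name",
    _DASHES,
    _row("....A", 1234, 567, "invoice.pdf"),
    _row("....A", 20480, 19000, "logo one.jpg"),
    _row("D....", 0, 0, "images"),
    _DASHES,
    _row("", 21714, 19567, "2 files, 1 folders"),
])

print(wanted_members(SAMPLE, {".jpg", ".png"}))  # ['logo one.jpg']
```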

The ( on the line preceding the for ... %%b matches the )>"%outfile%" and sends all output that would otherwise go to the console to the file named.

And the ^| on the echo lines outputs a literal | character, as | is a special character for cmd and needs to be escaped by ^ to be interpreted as a literal.

Upvotes: 0

K J

Reputation: 11739

To answer your question: the task itself is simple, involving FOR loops with recursion. A robust solution is harder, though, without knowing how those long-term, possibly mixed 7zip files are organised internally. For instance, if a 7zip contains two subfolders with identically named files, you will hit error conditions. I have allowed for that by using -aou to auto-rename if necessary. However, I have not added the folder name to each file, as that's an extra step.

@echo off & Title Extract Images from 7z files
set "extractor=C:\Program Files (x86)\7-Zip\7z.exe"
set "archives=*.7z"
set "filetypes=*.png *.jpg *.jpeg *.tif *.tiff"
set "startDir=C:\Users\WDAGUtilityAccount\Desktop"

FOR /R "%startdir%" %%I IN (%archives%) DO ( "%extractor%" e "%%I" -o"%%~dpI" %filetypes% -aou)

You can add extra archive types such as .rar or .zip in the same way that I have allowed for different image file types. HOWEVER, do test for such variations first, otherwise a second run will trigger that auto-renaming of existing duplicates. You can of course change -aou to -aos to skip files that already exist, or remove it as desired.
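One cautious way to test such variations is a dry run that only prints the 7z command lines it would execute. The Python sketch below assumes 7z is reachable as "7z" on the PATH; the function name build_commands and its parameters are made up for illustration. Only the e command and the -o, -aou and -aos switches come from the batch file above.

```python
import os


def build_commands(start_dir, extractor="7z",
                   archive_exts=(".7z",),
                   patterns=("*.png", "*.jpg", "*.jpeg", "*.tif", "*.tiff"),
                   overwrite_switch="-aou"):
    """Return one 7z argument list per archive found under start_dir."""
    commands = []
    for folder, _subdirs, names in os.walk(start_dir):
        for name in sorted(names):
            if os.path.splitext(name)[1].lower() in archive_exts:
                archive = os.path.join(folder, name)
                # "e" extracts without stored paths; -o<dir> is the archive's own folder.
                commands.append([extractor, "e", archive, "-o" + folder,
                                 *patterns, overwrite_switch])
    return commands


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for command in build_commands(".", archive_exts=(".7z", ".zip")):
        print(command)
```

Once the printed commands look right, each list could be passed to subprocess.run(command, check=True) to perform the extraction.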

An alternative approach on Windows is that standard Windows .zip files can be navigated like folders and thus kept compressed; a click on a file inside opens it in the default application. You then only need to extract to a folder when you wish to use a non-default app.


The secondary advantage, more directly related to your need, is that some viewers can read just the images and ignore any other contents (even if they can read text or HTML). One that I support which reads images in zips is SumatraPDF, and although it is not listed, it can open .7z files and show them in the same way.


Hence my answer to the posed question: I suggest converting the 7zip files to standard zip rather than expanding them, or using SumatraPDF as the default or alternative 7z viewer.

Upvotes: 2
