user3520363

Reputation: 380

Sort and move files into directories based on a filename pattern

To move files into folders I use this script:

@echo off
setlocal EnableExtensions DisableDelayedExpansion

rem // Define constants here:
set "SPLITCHAR=-"  & rem // (a single character to split the file names)
set "SEARCHSTR=_"  & rem // (a certain string to be replaced by another)
set "REPLACSTR= "  & rem // (a string to replace all found search strings)
set "OVERWRITE="   & rem // (set to non-empty value to force overwriting)

rem // Get file location and pattern from command line arguments:
set "LOCATION=%~1" & rem // (directory to move the processed files into)
set "PATTERNS=%~2" & rem // (file pattern; match all files if empty)

rem /* Prepare overwrite flag (if defined, set to character forbidden
rem    in file names; this affects later check for file existence): */
if defined OVERWRITE set "OVERWRITE=|"
rem // Continue only if target location is given:
if defined LOCATION (
    rem // Create target location (suppress error if it already exists):
    2> nul md "%LOCATION%"
    rem /* Loop through all files matching the given pattern
    rem    in the current working directory: */
    for /F "eol=| delims=" %%F in ('dir /B "%PATTERNS%"') do (
        rem // Process each file in a sub-routine:
        call :PROCESS "%%F" "%LOCATION%" "%SPLITCHAR%" "%SEARCHSTR%" "%REPLACSTR%"
    )
)

endlocal
exit /B


:PROCESS
rem // Retrieve first argument of sub-routine:
set "FILE=%~1"
rem // Split name at (first) split character and get portion in front:
for /F "delims=%~3" %%E in ("%~1") do (
    rem // Append a split character to partial name:
    set "FOLDER=%%E%~3"
)
setlocal EnableDelayedExpansion
rem // Right-trim partial name:
if not "%~4"=="" set "FOLDER=!FOLDER:%~4%~3=!"
set "FOLDER=!FOLDER:%~3=!"
rem /* Check whether partial name is not empty
rem    (could happen if name began with split character): */
if defined FOLDER (
    rem // Replace every search string with another:
    if not "%~4"=="" set "FOLDER=!FOLDER:%~4=%~5!"
    rem // Create sub-directory (suppress error if it already exists):
    2> nul md "%~2\!FOLDER!"
    rem /* Check if target file already exists; if overwrite flag is
    rem    set (to an invalid character), the target cannot exist: */
    if not exist "%~2\!FOLDER!\!FILE!%OVERWRITE%" (
        rem // Move file finally (suppress `1 file(s) moved.` message):
        1> nul move /Y "!FILE!" "%~2\!FOLDER!"
    )
)
endlocal
exit /B

To use the script I must:

1- open cmd
2- change to the folder and run the batch file like this:

cd C:\Users\Administrator\Desktop\T\
"C:\Users\Administrator\Desktop\T\build-folder-hierarchy.bat" "C:\Users\Administrator\Desktop\T\" "*.pdf"

So what is the problem?

The batch file creates a separate folder for each .pdf file, which is not what I want. See https://i.sstatic.net/bRsPc.png

aaaa aaaa S02 [folder]
aaaa aaaa S02e01.pdf [folder]
aaaa aaaa S02e02.pdf [folder]
aaa.aaaa.aaa.aa.aaaaa.S02 [folder]

What I want instead:

├─aaaa aaaa S02 [folder]
│ ├─aaaa aaaa S02e01.pdf [file]
│ ├─aaaa aaaa S02e02.pdf [file]
│ └─ ....
├─aaa.aaaa.aaa.aa.aaaaa.S02 [folder]
│ └─aaa.aaaa.aa.aa.aaaaa.S02E13.pdf [file]
:

Some example names to show how the .pdf file names are formatted:

aaaaaaaaa aa aaaaa S01e12 720p Repack.pdf
aaa aaaaaaaaa S01e05 Versione 720p.pdf
aaa aaaaaaaa S01e05 Versione 1080p.pdf
aaa aaaa s2e06.pdf
aaa aaaa S03e12.pdf
aaa.aaaa.aaa.on.Earth.S02E13.pdf
aaa.aaaa.aaaa.S02E01.HDTV.x264.SUB.ITA.pdf

The .pdf file names usually contain one of these patterns:

s01
s01e1
s1
s1e1
s1e01
s1e01-10

The characters s and e are almost always present in these patterns. The general forms are:

sxx
sxxex
sx
sxex
sxexx
sxexx-xx

Here x is a digit; the case of the letters s and e is irrelevant.

A PowerShell solution is also welcome as an answer.

Upvotes: 2

Views: 1542

Answers (2)

Aacini

Reputation: 67216

Your question is confusing. You have not described the format of the file names but only shown some examples, and examples given in place of a specification can be misunderstood. Posting code written for a different problem that does not work on this one is certainly not useful. You did not show an example of the input and the wanted output using real file names, so a solution based on the example data may not work on the real data.

EDIT: New specification added. Both the specification and the program code have been modified according to a request given in the comments.

Below are the specifications of this problem as I understand them:


"Given a series of *.pdf files with this format:

any string hereS##Eany string here.pdf
              / | ^-- "E" letter
    "S" letter  digit

extract the string that ends just before the "E" that follows the "S-digit" delimiter, and move the file to a folder with that name. The "S" and "E" letters are not case sensitive. Ignore files that do not have this format."

This code solves the problem based on these specifications:

@echo off
setlocal EnableDelayedExpansion

rem Change current drive and directory to the one where this .bat file is located
cd /D "%~dp0"

set "digits=0123456789"

rem Process all *.pdf files
for %%f in (*.pdf) do (

   rem Get the folder name of this file
   call :getFolder "%%f"

   rem If this file has a properly formatted name: "headS##Etail"
   if defined folder (
      rem Move the file to such folder
      if not exist "!folder!" md "!folder!"
      move "%%f" "!folder!"
   )

)
goto :EOF


:getFolder file

set "folder="
set "file=%~1"
set "head="
set "tail=%file%"

:next
   for /F "delims=%digits%" %%a in ("%tail%") do set "head=%head%%%a"
   set "tail=!file:*%head%=!"
   if not defined tail exit /B
   if /I "%head:~-1%" equ "S" goto found
   :digit
      if "!digits:%tail:~0,1%=!" equ "%digits%" goto noDigit
      set "head=%head%%tail:~0,1%"
      set "tail=%tail:~1%"
   goto digit
   :noDigit
goto next

:found
for /F "delims=Ee" %%a in ("%tail%") do set "folder=%head%%%a"
exit /B

To use this Batch file, place it in the same folder as the original files and run it without parameters; you may also run it by double-clicking it in Explorer. Example session:

C:\Users\Antonio\Documents\test> dir /B
test.bat
The_Good_Wife_S06e15.pdf
The_Good_Wife_S06e22.pdf
TOCCO_ANGELO_4.pdf
True Blood S07e07_001.pdf
True Detective S02E03-04 Repack.pdf
True Detective S02e03.pdf
True Detective S02e03_001.pdf
True.Detective.S02e02.1080p.WEBMux.pdf
Tudors S04e08.pdf
Tutti pazzi per amore s3e15-16.pdf
Tutto Può Succedere S01e01-02.pdf
Twin Peaks s1e1-8.pdf
Twin Peaks s2e16-22.pdf
Tyrant S02e07.pdf
Tyrant.S01e01_02.720p.DLMux.pdf
Ultimo 2 - La Sfida.pdf
Ultimo 3 -L Infiltrato.pdf
Una Mamma Imperfetta S02e01-13.pdf
Under the Dome S02e02 Versione 720p.pdf
Under.the.Dome.S03E07.HDTV.x264.SUB.ITA.pdf

C:\Users\Antonio\Documents\test> test.bat

C:\Users\Antonio\Documents\test> tree /F
Folder PATH listing
Volume serial number is 00000088 0895:160E
C:.
│   test.bat
│   TOCCO_ANGELO_4.pdf
│   Ultimo 2 - La Sfida.pdf
│   Ultimo 3 -L Infiltrato.pdf
│
├───The_Good_Wife_S06
│       The_Good_Wife_S06e15.pdf
│       The_Good_Wife_S06e22.pdf
│
├───True Blood S07
│       True Blood S07e07_001.pdf
│
├───True Detective S02
│       True Detective S02E03-04 Repack.pdf
│       True Detective S02e03.pdf
│       True Detective S02e03_001.pdf
│
├───True.Detective.S02
│       True.Detective.S02e02.1080p.WEBMux.pdf
│
├───Tudors S04
│       Tudors S04e08.pdf
│
├───Tutti pazzi per amore s3
│       Tutti pazzi per amore s3e15-16.pdf
│
├───Tutto Può Succedere S01
│       Tutto Può Succedere S01e01-02.pdf
│
├───Twin Peaks s1
│       Twin Peaks s1e1-8.pdf
│
├───Twin Peaks s2
│       Twin Peaks s2e16-22.pdf
│
├───Tyrant S02
│       Tyrant S02e07.pdf
│
├───Tyrant.S01
│       Tyrant.S01e01_02.720p.DLMux.pdf
│
├───Una Mamma Imperfetta S02
│       Una Mamma Imperfetta S02e01-13.pdf
│
├───Under the Dome S02
│       Under the Dome S02e02 Versione 720p.pdf
│
└───Under.the.Dome.S03
        Under.the.Dome.S03E07.HDTV.x264.SUB.ITA.pdf

This code would be much simpler if the file name before the "S" delimiter could not contain digits. This solution assumes that there are no exclamation marks (!) in the file names.
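
For readers who can use something other than Batch, the specification quoted above can be sketched in a few lines of Python. This is only an illustration under my own reading of the specification: the names `SEASON`, `folder_for` and `plan_moves` are made up for the example, and the regular expression requires the "E" to follow the season digits directly, which is slightly stricter than the Batch code.

```python
import re
from collections import defaultdict

# Assumed reading of the specification: the folder name ends at the first
# "S" + digits that is immediately followed by "E" (either case).
SEASON = re.compile(r"^(.*?s\d+)e", re.IGNORECASE)

def folder_for(filename):
    """Return the target folder name, or None if the name does not match."""
    match = SEASON.match(filename)
    return match.group(1) if match else None

def plan_moves(filenames):
    """Group file names by target folder without touching the disk."""
    plan = defaultdict(list)
    for name in filenames:
        folder = folder_for(name)
        if folder is not None:
            plan[folder].append(name)
    return dict(plan)
```

The sketch only computes which file would go into which folder; creating the folders and moving the files (for example with `os.makedirs` and `shutil.move`) is deliberately left out so the plan can be inspected first.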

Upvotes: 1

SomethingDark

Reputation: 14305

The easiest way to get the last instance of a regex substring is to break the string into chunks and process the chunks in reverse order.

:: A script for grouping PDF files based on book series name
:: http://i.imgur.com/seh6p.gif

@echo off
setlocal enabledelayedexpansion
cls

:: Main Directory Containing PDF Directories (change this to suit your needs)
set "source_dir=.\test"

:: Move to source dir and process each folder, one at a time.
pushd "%source_dir%"

for /f "delims=" %%A in ('dir /b /a:d') do (
    call :getSeriesName "%%A" series_name

    mkdir "!series_name!" 2>nul

    REM If you want to do additional cleanup, change the copy to a move
    copy "%%A\*.pdf" "!series_name!" >nul
)

popd
exit /b

::------------------------------------------------------------------------------
:: Extracts the series name from the directory and changes spaces to periods
:: 
:: Arguments: %1 - The original book release name
::            %2 - The variable that will contain the returned value because
::                 batch doesn't actually have functions
:: Returns:   The series name and volume number
::------------------------------------------------------------------------------
:getSeriesName
:: Convert spaces to periods
set "raw_name=%~1"
set standard_name=!raw_name: =.!

:: Split the folder name into period-delimited tokens
set token_counter=0
:name_split
for /f "tokens=1,* delims=.-" %%B in ("!standard_name!") do (
    set name_part[!token_counter!]=%%B
    set standard_name=%%C
    set /a token_counter+=1
    goto :name_split
)

:: Get the volume number
for /l %%B in (0,1,!token_counter!) do (
    echo !name_part[%%B]!|findstr /R /C:"[sS][0-9][0-9]*[eE][0-9][0-9]*" >nul
    if !errorlevel! equ 0 (
        set /a name_end=%%B-1
        set volume_value=!name_part[%%B]!
        set volume_value=!volume_value:~0,3!
    )
)

:: Rebuild the series name
set "extracted_name="
for /l %%B in (0,1,!name_end!) do set "extracted_name=!extracted_name!!name_part[%%B]!."
set extracted_name=!extracted_name!!volume_value!

:: Purge the name_part array
for /l %%B in (0,1,!token_counter!) do set "name_part[%%B]="

:: Return the extracted name
set "%~2=!extracted_name!"
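
The chunk-and-reverse idea is not specific to Batch. A minimal Python sketch of the same technique follows; the names `MARKER` and `series_name` are hypothetical, and it assumes the season/episode marker sits at the start of a chunk. Unlike the script above, which keeps the first three characters of the matching chunk, it keeps the "S" plus all season digits.

```python
import re

# Assumption: a chunk counts as the marker when it starts with
# "s<digits>e<digits>" in either case.
MARKER = re.compile(r"(s\d+)e\d+", re.IGNORECASE)

def series_name(release_name):
    """Return 'Series.Name.Sxx', or None when no marker chunk is found."""
    # Normalise spaces to periods, then split on periods and hyphens,
    # dropping empty chunks produced by adjacent delimiters.
    normalised = release_name.replace(" ", ".")
    chunks = [c for c in re.split(r"[.-]", normalised) if c]
    # Walk the chunks from the end so the last marker wins.
    for index in range(len(chunks) - 1, -1, -1):
        match = MARKER.match(chunks[index])
        if match:
            return ".".join(chunks[:index] + [match.group(1)])
    return None
```

Scanning from the end means that a title which itself contains something that looks like a marker still resolves to the last, real one.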

Upvotes: 1
