Reputation: 247
I have to check a directory tree for duplicate files and write all of them to a List.txt file. But my script seems to skip one of the file locations in each group. (For example, if there are 4 duplicate files, only 3 of them appear in the list.)
If I'm not mistaken, it's the location of the "previousFile" of the last comparison that is missing. How do I write it to the list, too?
Also, how can I group paths in the List.txt by the filename so that it looks something like this:
File fileNameA.txt :
C:\path1\fileNameA.txt
C:\path2\fileNameA.txt
C:\path3\fileNameA.txt
File fileNameB.txt :
C:\path1\fileNameB.txt
C:\path2\fileNameB.txt
C:\path3\fileNameB.txt
C:\path4\fileNameB.txt
File fileNameC.txt :
C:\path1\fileNameC.txt
C:\path2\fileNameC.txt
...
?
This is my script so far:
@echo off
setlocal disableDelayedExpansion
set root=%1
IF EXIST List.txt del /F List.txt
set "previousTest=none"
set "previousFile=none"
for /f "tokens=1-3 delims=:" %%A in (
  '"(for /r "%root%" %%F in (*) do @echo %%~zF:%%~fF:)|sort"'
) do (
  set "currentTest=%%A"
  set "currentFile=%%B:%%C"
  setlocal enableDelayedExpansion
  set "match="
  if !currentTest! equ !previousTest! fc /b "!previousFile!" "!currentFile!" >nul && set match=1
  if defined match (
    echo File "!currentFile!" >> List.txt
    endlocal
  ) else (
    endlocal
    set "previousTest=%%A"
    set "previousFile=%%B:%%C"
  )
)
Upvotes: 2
Views: 1323
Reputation: 30153
You need to count matches and, on the first match, echo the previous filename in addition to the current one.
Note the changes in '"(for /r "%root%" %%F in (*) do @echo(%%~nxF?%%~zF?%%~fF?)|sort"':
? (question mark) is used as the delimiter, because it is a reserved character according to Naming Files, Paths, and Namespaces and so cannot occur in a file name;
the %%~nxF? prefix makes sort order the output properly by file name, even in my sloppy test folder structure (see the sample output below).
This output shows that even characters poisonous to cmd (like &, %, ! etc.) in file names are handled properly as long as DisableDelayedExpansion is kept.
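To see how the ? delimiter splits one line of the sorted listing, here is a minimal sketch. The sample line is made up and only mimics what the inner for /r loop would emit for a single file; it is not taken from a real run.

```bat
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem The quoted string below is a hypothetical sample line in the form name?size?fullpath?
for /f "tokens=1-3 delims=?" %%A in ("fileNameA.txt?120?C:\path1\fileNameA.txt?") do (
  echo name=%%A
  echo size=%%B
  echo path=%%C
)
endlocal
```

Each field should come out on its own line. The trailing ? after the path closes the third token, so anything that follows it stays out of the file name.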
@ECHO OFF
SETLOCAL EnableExtensions DisableDelayedExpansion
set "root=%~1"
if not defined root set "root=%CD%"
set "previousTest="
set "previousFile="
set "previousName="
set "match=0"
for /f "tokens=1-3 delims=?" %%A in (
  '"(for /r "%root%" %%F in (*) do @echo(%%~nxF?%%~zF?%%~fF?x)|sort"'
) do (
  set "currentName=%%A"
  set "currentTest=%%B"
  set "currentFile=%%C"
  Call :CompareFiles
)
ENDLOCAL
goto :eof
:CompareFiles
if /I "%currentName%" equ "%previousName%" ( set /A "match+=1" ) else ( set "match=0" )
if %match% GEQ 1 (
  if %match% EQU 1 echo FILE "%previousFile%" %previousTest%
  echo "%currentFile%" %currentTest%
) else (
  set "previousName=%currentName%"
  set "previousTest=%currentTest%"
  set "previousFile=%currentFile%"
)
goto :eof
The above script lists all files with duplicated names, regardless of their size and content. Sample output:
FILE "d:\bat\cliPars\cliParser.bat" 1078
"d:\bat\files\cliparser.bat" 12303
"d:\bat\Unusual Names\cliparser.bat" 12405
"d:\bat\cliparser.bat" 335
FILE "d:\bat\Stack33721424\BÄaá^ cčD%OS%Ď%%OS%%(%1!)&°~%%G!^%~2.foo~bar.txt" 120
"d:\bat\Unusual Names\BÄaá^ cčD%OS%Ď%%OS%%(%1!)&°~%%G!^%~2.foo~bar.txt" 120
To list all files with duplicated names and the same size, regardless of their content, replace the first if statement of :CompareFiles as follows:
:CompareFiles
REM if /I "%currentName%" equ "%previousName%" (
if /I "%currentTest%%currentName%" equ "%previousTest%%previousName%" (
  set /A "match+=1"
  REM fc /b "%previousFile%" "%currentFile%" >nul && set /A "match+=1"
) else ( set "match=0" )
To list all files with duplicated names, the same size and the same binary content, replace the first if statement of :CompareFiles as follows:
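:CompareFiles
REM if /I "%currentName%" equ "%previousName%" (
if /I "%currentTest%%currentName%" equ "%previousTest%%previousName%" (
  REM set /A "match+=1"
  REM reset the counter when the contents differ, otherwise a stale count would list this file as a duplicate
  fc /b "%previousFile%" "%currentFile%" >nul && ( set /A "match+=1" ) || ( set "match=0" )
) else ( set "match=0" )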
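For the grouped layout asked for in the question (a "File name :" header followed by bare paths), the printing part of the name-based :CompareFiles could be changed roughly as below. This is only a sketch under my own assumptions, not a tested drop-in: it prints unquoted paths, so it switches delayed expansion on just for the echo lines to keep special characters in file names from breaking them.

```bat
:CompareFiles
rem Sketch only: variant of the name-based subroutine, printing the layout from the question
if /I "%currentName%" equ "%previousName%" ( set /A "match+=1" ) else ( set "match=0" )
if %match% EQU 0 (
  rem new file name: remember it and print nothing yet
  set "previousName=%currentName%"
  set "previousTest=%currentTest%"
  set "previousFile=%currentFile%"
  goto :eof
)
rem delayed expansion is enabled for printing only, values are echoed without being reparsed
setlocal EnableDelayedExpansion
if %match% EQU 1 (
  echo(File !previousName! :
  echo(!previousFile!
)
echo(!currentFile!
endlocal
goto :eof
```

The script still writes to the console, so redirect it to get the file, for example dupes.bat "C:\some\tree" >List.txt (dupes.bat and the path are placeholder names).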
Edit: If the name of the file doesn't matter (only its contents), you could apply the following changes to the FOR loop and the :CompareFiles subroutine:
@ECHO OFF
SETLOCAL EnableExtensions DisableDelayedExpansion
set "root=%~1"
if not defined root set "root=%CD%"
set "previousTest="
set "previousFile="
set "match=0"
for /f "tokens=1-2 delims=?" %%A in (
  '"(for /r "%root%" %%F in (*) do @echo(%%~zF?%%~fF?)|sort"'
) do (
  set "currentTest=%%A"
  set "currentFile=%%B"
  rem optional: skip all files of zero length
  if %%A GTR 0 Call :CompareFiles
)
ENDLOCAL
goto :eof
:CompareFiles
if /I "%currentTest%" equ "%previousTest%" (
  rem reset the counter when the contents differ, otherwise a stale count would list this file as a duplicate
  fc /b "%previousFile%" "%currentFile%" >nul && ( set /A "match+=1" ) || ( set "match=0" )
) else ( set "match=0" )
if %match% GEQ 1 (
  if %match% EQU 1 echo FILE "%previousFile%" %previousTest%
  echo "%currentFile%" %currentTest%
) else (
  set "previousTest=%currentTest%"
  set "previousFile=%currentFile%"
)
goto :eof
Upvotes: 2