user1769925

Reputation: 598

Change extension of files in windows batch

I am trying to rename a lot of files. I only want to change the extension from ".pdf.OCR.pdf" to ".pdf". So far I have the following code:

rem for /r myPDFfolder %%i in (*.pdf.OCR.pdf) do ren "%%i" "%%~ni.pdf"

But it does not appear to work with the extension that has multiple dots -- what am I doing wrong?

Upvotes: 1

Views: 3272

Answers (3)

aschipfl

Reputation: 34899

Alternative solution, without delayed expansion (remove ECHO to actually rename any files):

@echo off
rem iterate over all matching files:
for /F "delims=" %%A in (
  'dir /S /B /A:-D "myPDFfolder\*.pdf.OCR.pdf"'
) do (
  rem "%%~nA" removes last ".pdf"
  for /F "delims=" %%B in ("%%~nA") do (
    rem "%%~nB" removes ".OCR" part
    for /F "delims=" %%C in ("%%~nB") do (
      rem "%%~nC" removes remaining ".pdf"
      ECHO ren "%%~fA" "%%~nC.pdf"
    ) & rem next %%C
  ) & rem next %%B
) & rem next %%A

NOTE: The directory tree is enumerated completely by dir /S /B before for /F iterates over the result; otherwise, some items might be skipped or renamed twice (see this post concerning that issue).

Upvotes: 0

dbenham

Reputation: 130819

There is no need for a batch file. A moderate-length one-liner from the command prompt can do the trick.

If you know for a fact that all files that match *.pdf.ocr.pdf have this exact case: .pdf.OCR.pdf, then you can use the following from the command line:

for /r "myPDFfolder" %F in (.) do @ren "%F\*.pdf.ocr.pdf" *O&ren "%F\*.pdf.o" *f

The first rename truncates each name after the upper-case O, leaving *.pdf.O, and the second truncates after the last lower-case f, leaving the desired *.pdf. The above works because *O in the target mask preserves everything in the original file name through the last occurrence of upper-case O, and *f preserves through the last occurrence of lower-case f. Note that the characters in the source mask are not case sensitive. You can read more about how this works at "How does the Windows RENAME command interpret wildcards?"
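The two-step rename described above can be tried safely before touching real files. The following is only a sketch: the scratch folder name ren_mask_demo and the sample file name report.pdf.OCR.pdf are made up for illustration, and it is written as a batch file rather than a one-liner so the intermediate state can be listed.

```batch
@echo off
rem Sketch only: the folder and file names below are hypothetical.
setlocal
set "demo=%TEMP%\ren_mask_demo"
mkdir "%demo%" 2>nul

rem Create one empty sample file with the exact case .pdf.OCR.pdf
type nul > "%demo%\report.pdf.OCR.pdf"

rem Step 1: target mask *O keeps everything through the last upper-case O
ren "%demo%\*.pdf.ocr.pdf" *O
dir /b "%demo%"

rem Step 2: target mask *f keeps everything through the last lower-case f
ren "%demo%\*.pdf.o" *f
dir /b "%demo%"

rem Remove the scratch folder again
rd /s /q "%demo%"
```

If the wildcard rule works as described, the first listing should show the intermediate name ending in .pdf.O and the second the final name ending in .pdf.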

If the case of .pdf.ocr.pdf can vary, then the above will fail miserably. But there is still a one-liner that works from the command line:

for /r "myPDFfolder" %F in (*.pdf.ocr.pdf) do @for %G in ("%~nF") do @ren "%F" "%~nG"

%~nF lops off the last .pdf, and %~nG lops off the .OCR, which leaves the desired extension of .pdf.
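To see what each ~n modifier yields without renaming anything, a small batch sketch can echo the intermediate values. The sample name report.pdf.OCR.pdf is hypothetical, and no such file needs to exist because the modifier works on the string alone. Inside a batch file the loop variables are written with doubled percent signs.

```batch
@echo off
rem Sketch only: "report.pdf.OCR.pdf" is a made-up sample name.
for %%F in ("report.pdf.OCR.pdf") do (
  rem %%~nF drops the text after the last dot: report.pdf.OCR
  echo %%~nF
  for %%G in ("%%~nF") do (
    rem %%~nG drops the next trailing part: report.pdf
    echo %%~nG
  )
)
```

Each ~n strips exactly one trailing dot-separated part, which is why two nested loops are needed to get from .pdf.OCR.pdf down to .pdf.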

You should not have to worry about a file being renamed twice because the result after the rename will not match *.pdf.ocr.pdf unless the original file looked like *.pdf.ocr.pdf.ocr.pdf.

If you think you might want to frequently rename files with complex patterns in the future, then you should look into JREN.BAT - a regular expression renaming utility. It is pure script (hybrid JScript/batch) that runs natively on any Windows machine from XP onward. Full documentation is embedded within the script.

Assuming JREN.BAT is in a folder that is listed within your PATH, the following simple command will work from the command line, only renaming files that match the case in the search string:

jren "(\.pdf)\.OCR\.pdf$" $1 /s /p "myPDFfolder"

If you want to ignore case when matching, but want to force the extension to be lower case, then:

jren "\.pdf\.ocr\.pdf$" ".pdf" /i /s /p "myPDFfolder"

Upvotes: 1

woxxom

Reputation: 73506

The extension is the part of the file name after the last dot, so %%~ni removes only the final .pdf.

Use string replacement to strip the unneeded part:

setlocal enableDelayedExpansion
for /f "eol=* delims=" %%i in ('dir /s /b "r:\*.pdf.OCR.pdf"') do (
    set "name=%%~nxi"
    ren "%%i" "!name:.pdf.OCR=!"
)

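P.S. The output of dir is parsed, rather than letting for walk the folder directly, so that the complete file list is built before any renaming starts. This makes the code more robust if a different text is stripped: such a rename could change the sorting order and cause for to process the same file two or more times.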
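The string replacement used above can be checked in isolation before running it against real files. This is a minimal sketch with two made-up sample names; it only echoes the result and renames nothing.

```batch
@echo off
rem Sketch only: the sample names are hypothetical.
setlocal enableDelayedExpansion

set "name=report.pdf.OCR.pdf"
rem Replaces ".pdf.OCR" with nothing: prints report.pdf
echo !name:.pdf.OCR=!

set "name=scan.PDF.ocr.pdf"
rem The search text is matched case-insensitively: prints scan.pdf
echo !name:.pdf.OCR=!
```

Two things are worth keeping in mind: the replacement removes every occurrence of the search text, not just the last one, and with delayed expansion enabled, file names that contain an exclamation mark will not survive the expansion intact.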

Upvotes: 4
