user3520363

Reputation: 380

Rename PDF files from numeric strings to their original titles using the same .html page (HTTP link) or a .txt file

I downloaded some PDF files from HERE.

The PDF files are not saved under their original titles but under numeric strings like

1610.00005
1610.00022

Fortunately, the HTTP page (or a .txt file, if I copy it for offline renaming) contains the corresponding numeric -> original title mapping.

For example, when I download these files

- A Note on Time Operators in Relativistic Quantum Mechanics
- A Stronger Theorem Against Macro-realism
- Determining quantum correlations in bipartite systems - from qubit to qutrit and beyond
- Pair entanglement in dimerized spin-s chains

they are saved with these filenames

1610.00005.pdf
1610.00022.pdf
1610.00041.pdf
1610.00056.pdf

But I want to rename them to their original titles instead of numeric strings. I'd like to specify an HTTP link or a text file as the source of the names.

This is the only code I have

(PowerShell)

$names = Get-Content C:\myfiles
Get-ChildItem C:\somedir\*.pdf | Sort -desc | 
    Foreach {$i=0} {Rename-Item $_ ($_.basename + $names[$i++] + $_.extension) -WhatIf}

or this batch code

@echo off
setlocal EnableDelayedExpansion

rem Load the list of authors:
set i=0
for /F %%a in (myfiles.txt) do (
   set /A i+=1
   set "author[!i!]=%%a"
)

rem Do the rename:
set i=0
for /F %%a in ('dir /b *.pdf') do (
   set /A i+=1
   for %%i in (!i!) do ren "%%a" "%%~Na!author[%%i]!%%~Xa"
)

Upvotes: 1

Views: 195

Answers (1)

TessellatingHeckler

Reputation: 29048

#All PDFs | Rename { query Arxiv for the abstract by filename, use the page title + ".pdf"}

Get-ChildItem *.pdf | Rename-Item -NewName { 
    $title = (Invoke-WebRequest "https://arxiv.org/abs/$($_.BaseName)").parsedhtml.title
    $title = $title -replace '[\\/:\*\?"<>\|]', '-'       # replace forbidden characters
    "$title.pdf"                                          # in filenames with -
}

You might want to put -WhatIf on the end of the Rename-Item call first, to see what it would do, in case it ruins all the filenames. Or take a backup copy of the folder.
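The same dry run also works for the offline case the question mentions, where the titles are already in a text file. This is only a sketch under assumptions: `titles.txt` and the helper `Get-TitleMap` are hypothetical names, and the file is assumed to hold one tab-separated `number<TAB>title` pair per line, which may not match how the page was actually saved.

```powershell
# Hypothetical helper: builds a number -> title lookup from "number<TAB>title" lines.
function Get-TitleMap {
    param([string[]]$Lines)
    $map = @{}
    foreach ($line in $Lines) {
        # Split on the first tab only, so titles containing tabs stay intact.
        $id, $title = $line -split "`t", 2
        # Skip lines that have no tab or an empty half.
        if ($id -and $title) { $map[$id.Trim()] = $title.Trim() }
    }
    $map
}

# Assumption: titles.txt sits next to the PDFs; nothing happens if it is missing.
if (Test-Path .\titles.txt) {
    $map = Get-TitleMap (Get-Content .\titles.txt)
    Get-ChildItem *.pdf |
        Where-Object { $map.ContainsKey($_.BaseName) } |
        Rename-Item -NewName {
            # Replace characters that Windows forbids in filenames with dashes.
            ($map[$_.BaseName] -replace '[\\/:\*\?"<>\|]', '-') + '.pdf'
        } -WhatIf   # remove -WhatIf once the preview looks right
}
```

PDFs whose number is not in the file are skipped rather than renamed to an empty title.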

Edit: One of the titles is "Signatures of bifurcation on quantum correlations: Case of quantum kicked top" and the : is not allowed in a filename. Script edited to replace all forbidden characters in Windows filenames with dashes instead.
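As far as I know, the `ParsedHtml` property is only available in Windows PowerShell, not in PowerShell 6 and later. A rough alternative is to take the title from the raw HTML text and apply the same character replacement. The helper name `ConvertTo-PdfName` is made up for this sketch, and the regex assumes a single plain `<title>` element.

```powershell
# Hypothetical helper: turns a page's raw HTML into a filename that is safe on Windows.
function ConvertTo-PdfName {
    param([string]$Html)

    # (?s) lets . match newlines, (?i) ignores case; capture the text inside <title>.
    if ($Html -match '(?si)<title[^>]*>(.*?)</title>') {
        # Decode entities such as &amp; and trim surrounding whitespace.
        $title = [System.Net.WebUtility]::HtmlDecode($Matches[1]).Trim()

        # Same replacement as in the script above: forbidden characters become dashes.
        $title = $title -replace '[\\/:\*\?"<>\|]', '-'
        "$title.pdf"
    }
}

ConvertTo-PdfName '<html><head><title>Signatures of bifurcation on quantum correlations: Case of quantum kicked top</title></head></html>'
# -> Signatures of bifurcation on quantum correlations- Case of quantum kicked top.pdf
```

In the rename script, the HTML string would presumably come from `(Invoke-WebRequest $url).Content` instead of `.parsedhtml.title`. If the page has no title the function returns nothing, so that case should be checked before renaming.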

Upvotes: 1
