JeremyC

Reputation: 13

Bash To Convert PDF Files In Multiple Subdirectories

I am attempting to convert PDF files in 2,432 subdirectories (one PDF file per folder) to HTML files.

For example, I have tried a few variations of

find . -type d | while read d; for file in *.pdf; do pdftohtml -c -i -s "$file"; done

and

for f in ./*/*.pdf; do pdftohtml -c -i -s "$file"; done

without any success. I have also tried some others, but I just can't get anything to work this time.

I know that part of the code works because I can put multiple PDF files in one folder and use

for file in *.pdf; do pdftohtml -c -i -s "$file"; done

to convert all of the files in that one folder to HTML.

Is there a way that I can search through each folder and convert each file with a bash script? Or is this something I will have to do one folder at a time?

Upvotes: 1

Views: 499

Answers (3)

Incrivel Monstro Verde

Reputation: 948

Use find to locate every PDF under the current directory and run pdftohtml on each one:

find . -name \*.pdf -exec pdftohtml -c -i -s {} \;

Upvotes: 0

Socowi

Reputation: 27185

Your second command was almost right. There was just one small error:

for f in ./*/*.pdf; do pdftohtml -c -i -s "$file"; done

The loop variable is f, but the body uses $file, which is never set in that loop, so pdftohtml is not given the file name. Try:

for f in ./*/*.pdf; do pdftohtml -c -i -s "$f"; done
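If you want something a bit more defensive, here is a minimal sketch of the same glob loop wrapped in a function. The function name convert_one_level and the CONVERTER variable are made up for this example (they are not part of pdftohtml); CONVERTER is only there so you can dry-run the loop with echo before converting anything.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert every PDF exactly one directory below "$1".
# CONVERTER is an assumed override hook (not a pdftohtml feature) so the
# loop can be previewed with CONVERTER=echo.
convert_one_level() {
    local root=${1:-.}
    local f
    for f in "$root"/*/*.pdf; do
        # When nothing matches, the pattern stays unexpanded; skip it.
        [ -e "$f" ] || continue
        # Quoting "$f" keeps file names with spaces in one argument.
        ${CONVERTER:-pdftohtml} -c -i -s "$f"
    done
}
```

Running CONVERTER=echo convert_one_level . first prints the commands that would be executed. Note that this pattern only matches PDFs one level down, which fits the one-PDF-per-folder layout in the question but not deeper nesting.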

Upvotes: 0

oliv

Reputation: 13239

You can use the find command with the option -exec to trigger the conversion:

find /path/to/your/root/pdf/folder -type f -name "*.pdf" -exec bash -c 'pdftohtml -c -i -s "$1"' _ {} \;

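pdftohtml is executed once for every PDF file found, at any depth below the starting folder. find replaces {} with the path of each file, and bash -c receives that path as $1 (the _ is a placeholder that fills $0).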
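With a couple of thousand files you may prefer not to start a new shell for each one. As a rough sketch, the variant below ends -exec with + instead of \; so find passes many paths to a single bash -c, which then loops over them. The function name convert_tree and the CONVERTER variable are hypothetical, added only so the command can be previewed with echo; CONVERTER has to be exported because the loop runs in a child shell.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert every PDF at any depth below "$1".
# CONVERTER is an assumed, exported override hook for dry runs.
convert_tree() {
    local root=${1:-.}
    # "{} +" hands find's results to bash in batches; "for f" with no
    # list iterates over the positional parameters ($1, $2, ...).
    find "$root" -type f -name '*.pdf' -exec bash -c '
        for f; do
            ${CONVERTER:-pdftohtml} -c -i -s "$f"
        done
    ' _ {} +
}
```

For a preview, run export CONVERTER=echo and then convert_tree /path/to/your/root/pdf/folder; unset CONVERTER afterwards to do the real conversion.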

Upvotes: 1
