Reputation: 856
I am trying to run the following to extract the text from all the PDFs:
find *.pdf | awk '{system("pdftotext "$0)}'
But dang it, some crazy person put spaces in the file names. How can I deal with this smoothly?
Upvotes: 3
Views: 1110
Reputation: 46856
What is awk's role in this? Perhaps you should let find execute things itself:
find . -name \*.pdf -exec /path/to/pdftotext {} \;
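Or, if you're really, really stuck with piping find's output into awk and assuming the filenames are safe there (which you've proven they are not simply by asking this question), then put the filenames in quotes. This copes with spaces, though a name containing a double quote, a dollar sign, a backtick, or a newline will still break it: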
find . -name \*.pdf -print | awk '{cmd=sprintf("pdftotext \"%s\"", $0);system(cmd);}'
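If you want a pipeline but don't want any shell to re-parse the names, one option is to pass them NUL-delimited. This is only a sketch: the function name extract_pdf_text and the PDFTOTEXT override variable are made up for illustration, and -print0 / -0 are GNU and BSD extensions rather than POSIX.

```shell
# Sketch: convert every PDF under a directory, handing the names over
# NUL-delimited so spaces, quotes, and newlines are never parsed by a shell.
# extract_pdf_text and PDFTOTEXT are hypothetical names, not part of pdftotext.
extract_pdf_text() {
    dir=${1:-.}   # directory to search; defaults to the current one
    # -print0 terminates each name with NUL; xargs -0 splits on NUL only.
    # -n 1 runs the converter once per file, since pdftotext treats a
    # second argument as the output file name.
    find "$dir" -type f -name '*.pdf' -print0 |
        xargs -0 -n 1 "${PDFTOTEXT:-pdftotext}"
}
```

Call it as extract_pdf_text . (or with any directory). Note that GNU xargs runs the command once even when find matches nothing, unless you add -r.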
Upvotes: 2