Nicolas FRANCOIS

Reputation: 185

Bash script for encoding conversion

I'd like to convert all my work on a LaTeX book from ISO8859-1 to UTF-8.

I found a script that I would like to adapt. Here is what I wrote:

# start encoding
encodeFrom='ISO-8859-1'
# target encoding
encodeTo='UTF-8'
# finding files whose extensions correspond to the given parameter
for filename in ` find . -type f -name *.{$1}`
do    
    echo $filename
    # saving source file
    mv $filename $filename.save
    # convert file
    iconv -f $encodeFrom -t $encodeTo $filename.save -o $filename
    # check that file is in unix mode
    dos2unix $filename
done

The problem is that the 'find' command doesn't work. Testing it with "set -v", I get:

find . -type f -name *.{$1}

What did I do wrong? I tried to find my way through a bash book (two, as a matter of fact :-), but couldn't find a solution.

Even when trying to convert just the .tex files:

for filename in ` find . -type f -name *.tex`

I get:

find: paths must precede expression: arithmetique.tex

Upvotes: 1

Views: 851

Answers (2)

Inian

Reputation: 85530

See ParsingLs for why you should NOT parse the output of find or ls in a for loop.

Instead, feed the output of find to a while loop through process substitution, and ALWAYS double-quote your variables. Note that the pattern is written "*.$1": inside double quotes the braces of "*.{$1}" would be kept literally and no file would match.

while IFS= read -r filename
do
    echo "$filename"
    mv "$filename" "${filename}".save
    iconv -f "$encodeFrom" -t "$encodeTo" "${filename}".save -o "$filename"
    dos2unix "$filename"
done < <(find . -type f -name "*.$1")
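To see the word-splitting problem concretely, here is a minimal sketch. It assumes bash and a throwaway directory created with mktemp whose path contains no whitespace; the file name "my chapter.tex" and the counter variables are made up for the illustration.

```bash
#!/usr/bin/env bash
# Illustration only: the directory, file name and counters are hypothetical.
tmp=$(mktemp -d)
touch "$tmp/my chapter.tex"

# for loop over command substitution: the output is split on whitespace
for_count=0
for f in $(find "$tmp" -type f -name "*.tex"); do
    for_count=$((for_count + 1))
done

# while/read over process substitution: one iteration per line of output
while_count=0
while IFS= read -r f; do
    while_count=$((while_count + 1))
done < <(find "$tmp" -type f -name "*.tex")

echo "for loop iterations:   $for_count"
echo "while loop iterations: $while_count"
rm -rf "$tmp"
```

The for loop sees the single file as two separate words, while the while loop receives it as one name.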

As suggested below, I recommend using the -print0 option of find in case your filenames contain a newline character, whitespace or any other special characters:

-print0
       True; print the full file name on the standard output, followed by a null character 
       (instead of the  newline  character  that  -print  uses). This allows  file  names  
       that contain newlines or other types of white space to be correctly interpreted by 
       programs that process the find output.

Using it with find (note the space in -d '': written as -d'' the empty string vanishes and read takes the next word as its delimiter):

while IFS= read -r -d '' filename
do
    echo "$filename"
    mv "$filename" "${filename}".save
    iconv -f "$encodeFrom" -t "$encodeTo" "${filename}".save -o "$filename"
    dos2unix "$filename"
done < <(find . -type f -name "*.$1" -print0)

With -print0, find terminates each filename with a NUL character (\0), and read -d '' splits on that same character, so the names arrive intact whatever they contain.
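As a rough check of the difference, the sketch below creates one file whose name contains a newline and counts how many names each kind of loop reads. It assumes bash and a find that supports -print0; the directory, the file name and the variable names are invented for the example.

```bash
#!/usr/bin/env bash
# Illustration only: the directory and file name are hypothetical.
tmp=$(mktemp -d)
touch "$tmp/part"$'\n'"two.tex"   # one file whose name contains a newline

# newline-delimited read: the single name is cut in two
nl_count=0
while IFS= read -r f; do
    nl_count=$((nl_count + 1))
done < <(find "$tmp" -type f -name "*.tex")

# NUL-delimited read: the name arrives in one piece
nul_count=0
last_name=
while IFS= read -r -d '' f; do
    nul_count=$((nul_count + 1))
    last_name=$f
done < <(find "$tmp" -type f -name "*.tex" -print0)

echo "newline-delimited reads: $nl_count"
echo "NUL-delimited reads:     $nul_count"
rm -rf "$tmp"
```

Only the NUL-delimited loop hands back a path that actually exists on disk, which matters here because the loop body renames the file.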

Upvotes: 2

Nicolas FRANCOIS

Reputation: 185

OK, I found an answer: just "protect" the argument *.$1 with double quotes, like this: "*.$1". The quotes stop the shell from expanding the pattern before find sees it.
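A small sketch of what the quotes change, assuming bash; count_args, the temporary directory and the two file names are hypothetical helpers for the demonstration, and ext stands in for the script's $1. Without quotes the shell substitutes the matching file names, so find receives extra arguments after -name, which is why it complained about arithmetique.tex.

```bash
#!/usr/bin/env bash
# Illustration only: count_args, the directory and the file names are hypothetical.
count_args() { echo "$#"; }

tmp=$(mktemp -d)
touch "$tmp/a.tex" "$tmp/b.tex"
ext=tex   # stands in for the script's $1

# Unquoted: the shell replaces the pattern with the matching file names
unquoted=$(cd "$tmp" && count_args *."$ext")
# Quoted: the pattern is passed through as a single, literal argument
quoted=$(cd "$tmp" && count_args "*.$ext")
# find then does the matching itself
found=$(cd "$tmp" && find . -type f -name "*.$ext" | wc -l | tr -d ' ')

echo "unquoted pattern -> $unquoted arguments"
echo "quoted pattern   -> $quoted argument"
echo "files found      -> $found"
rm -rf "$tmp"
```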

All I had to do was ask :-)

\bye

Upvotes: 0
