Reputation: 1345
I am in a tight spot and could use some help coming up with a Linux shell script to convert a directory full of pipe-delimited files from their original file encoding to UTF-8. The source files are encoded as either US-ASCII or ISO-8859-1. The closest thing that I could come up with is:
iconv -f ISO8859-1 -t utf-8 * > name_of_utf8_file
This condenses all of the files into a single file, which is not needed but OK for this application. The problem is that I need to specify both the source and destination file encoding, so for half of the files I don't know what it does. Is there a way to write a shell script that checks each file's encoding first, using a command like file -i or the like?
Any advice here is much appreciated.
Upvotes: 2
Views: 2154
Reputation: 15206
This is one way of doing it (not properly tested, caveat emptor :)).
Maybe try it with a small subset first; this is more of a thought example than a turn-key solution.
for i in *
do
    # file -i prints the MIME type and charset, e.g. "text/plain; charset=iso-8859-1"
    if file -i "$i" | grep -q us-ascii; then
        iconv -f us-ascii -t utf-8 "$i" > "$i.utf8"
    elif file -i "$i" | grep -q iso-8859-1; then
        iconv -f iso8859-1 -t utf-8 "$i" > "$i.utf8"
    fi
done
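A slightly more defensive variant of the same idea, as a sketch only: it asks file for just the charset (file -b --mime-encoding), so a filename that happens to contain "us-ascii" cannot confuse the match, and it reports anything that is neither of the two expected encodings instead of skipping it silently. The function names to_utf8 and convert_dir are made up for illustration, and the sketch assumes the file and iconv commands found on a typical Linux system.

```shell
#!/bin/sh
# Hypothetical helper: convert one file to UTF-8, writing "<name>.utf8" next to it.
to_utf8() {
    src=$1
    # -b drops the filename, --mime-encoding prints only the charset,
    # e.g. "us-ascii" or "iso-8859-1"
    enc=$(file -b --mime-encoding "$src")
    case $enc in
        us-ascii|utf-8)
            # ASCII is already valid UTF-8, so a plain copy is enough
            cp "$src" "$src.utf8" ;;
        iso-8859-1)
            iconv -f ISO-8859-1 -t UTF-8 "$src" > "$src.utf8" ;;
        *)
            echo "skipping $src: unexpected encoding '$enc'" >&2
            return 1 ;;
    esac
}

# Hypothetical driver: convert every regular file in a directory (default: current one).
convert_dir() {
    dir=${1:-.}
    status=0
    for f in "$dir"/*; do
        [ -f "$f" ] || continue              # ignore subdirectories and the like
        case $f in *.utf8) continue ;; esac  # do not reconvert earlier output
        to_utf8 "$f" || status=1
    done
    return "$status"
}
```

Calling convert_dir /path/to/files (or just convert_dir inside the directory) leaves the originals untouched and writes one .utf8 file per input, so it is easy to spot-check a few results before replacing anything.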
Upvotes: 4