Sledge

Reputation: 1345

Linux script to automatically convert file type to UTF8

I am in a tight spot and could use some help coming up with a Linux shell script to convert a directory full of pipe-delimited files from their original file encoding to UTF-8. The source files are encoded as either US-ASCII or ISO-8859-1. The closest thing that I could come up with is:

iconv -f ISO8859-1 -t utf-8 * > name_of_utf8_file

This concatenates all of the files into a single file, which is not needed but is OK for this application. The problem is that I need to specify both the source and destination encoding, so for the half of the files that are not ISO-8859-1 I don't know what it does. Is there a way to write a shell script that detects each file's encoding with a command like file -i?

Any advice here is much appreciated.

Upvotes: 2

Views: 2154

Answers (1)

tink

Reputation: 15206

This is one way of doing it (not properly tested, caveat emptor :)). Try it on a small subset first; it is more of a thought example than a turn-key solution.

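for i in *
do
  # Skip directories and the output of a previous run
  [ -f "$i" ] || continue
  case "$i" in *.utf8) continue ;; esac
  # file -bi prints only the MIME type and charset, e.g. "text/plain; charset=us-ascii"
  if file -bi "$i" | grep -q us-ascii; then
    iconv -f us-ascii -t utf-8 "$i" > "${i}.utf8"
  elif file -bi "$i" | grep -q iso-8859-1; then
    iconv -f iso8859-1 -t utf-8 "$i" > "${i}.utf8"
  fi
done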
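
A variation on the loop above, as a minimal sketch rather than a tested solution: it asks file for the encoding once per file and branches with case. The function name convert_dir_to_utf8 is made up for illustration, and the sketch assumes a file implementation that supports --mime-encoding (the common libmagic-based one does). Because every US-ASCII byte means the same thing in ISO-8859-1, a single iconv source encoding should cover both kinds of input. Anything detected as something else is reported and left alone.

```shell
#!/bin/sh
# convert_dir_to_utf8 is a hypothetical helper name, not part of the original answer.
# Usage: convert_dir_to_utf8 [directory]   (defaults to the current directory)
convert_dir_to_utf8() {
  dir=${1:-.}
  for f in "$dir"/*; do
    [ -f "$f" ] || continue                 # skip directories and unmatched globs
    case "$f" in *.utf8) continue ;; esac   # skip output of an earlier run
    # Assumes file(1) supports --mime-encoding; -b drops the file name from the output
    enc=$(file -b --mime-encoding "$f")
    case "$enc" in
      us-ascii|iso-8859-1)
        # US-ASCII is a subset of ISO-8859-1, so one source encoding handles both
        iconv -f ISO-8859-1 -t UTF-8 "$f" > "$f.utf8"
        ;;
      *)
        echo "skipping $f: detected encoding '$enc'" >&2
        ;;
    esac
  done
}
```

Calling convert_dir_to_utf8 /path/to/files writes a .utf8 copy next to each converted file and leaves the originals in place, so it can be tried on a small subset first.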

Upvotes: 4
