Svish

Reputation: 157971

Converting webpages from UTF-8 to ISO-8859-1 in linux

Does anyone have a neat trick for converting a number of PHP and HTML files from UTF-8 to ISO-8859-1 on Linux (Ubuntu)?

Upvotes: 10

Views: 16866

Answers (2)

HMM

Reputation: 3013

Ubuntu has recode:

$ sudo apt-get install recode
$ recode UTF-8..latin1 *.php

Recursively, thanks to Ted Dziuba:

$ find . -name "*.php" -exec recode UTF-8..latin1 {} \;
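The recursive command above only matches *.php, while the question also asks about HTML files. A minimal sketch of one way to cover both, assuming GNU or BSD find; the helper name list_web_files is hypothetical and only used for illustration:

```shell
# Hypothetical helper: list every .php and .html file under a directory.
# The escaped parentheses group the two -name tests, so anything placed
# after them (such as -exec) applies to both patterns, not just the last one.
list_web_files() {
    find "${1:-.}" -type f \( -name "*.php" -o -name "*.html" \)
}

# Preview the list first, then convert with the same expression
# (left commented out because recode rewrites the files in place):
# list_web_files .
# find . -type f \( -name "*.php" -o -name "*.html" \) -exec recode UTF-8..latin1 {} +
```

Listing the files before converting is cheap insurance, since recode overwrites each file without keeping a backup.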

Upvotes: 20

Varkhan

Reputation: 16751

I think iconv is your answer...

From man iconv:

  NAME
      iconv - Convert encoding of given files from one encoding to another

  SYNOPSIS
      iconv -f encoding -t encoding inputfile

  DESCRIPTION
      The iconv program converts the encoding of characters in inputfile from one coded 
      character set to another. The result is written to standard output unless otherwise 
      specified by the --output option.

      .....

So you could probably do something like:

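# \( ... \) groups both patterns; ISO-8859-1 is Latin-1 (ISO-8859-2 would be Latin-2)
find "$my_base_dir" \( -name "*.php" -o -name "*.html" \) -exec sh -c '
   iconv -f UTF-8 -t ISO-8859-1 "$1" -o "$1.iconv" &&
   mv "$1.iconv" "$1"
' sh {} \;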

This recursively finds the matching files and re-encodes them. The temporary file is necessary because iconv truncates the output file before it starts reading, so writing straight back to the input file would empty it.
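The temporary-file step can also be wrapped in a small function. This is only a sketch, assuming an iconv that exits non-zero when a character has no ISO-8859-1 equivalent (glibc's does); the name convert_to_latin1 is hypothetical:

```shell
# Hypothetical helper: convert one file from UTF-8 to ISO-8859-1 in place.
# iconv writes to a temporary file, and the original is replaced only if
# the conversion succeeded; otherwise the original is left untouched.
convert_to_latin1() {
    src=$1
    tmp="$src.iconv"
    if iconv -f UTF-8 -t ISO-8859-1 "$src" > "$tmp"; then
        mv "$tmp" "$src"
    else
        rm -f "$tmp"
        echo "skipped (not convertible): $src" >&2
        return 1
    fi
}
```

Called as convert_to_latin1 index.php, it leaves files that contain characters outside Latin-1 (the euro sign, for example) unchanged and reports them on stderr. Note that mv replaces the file, so the converted copy gets default permissions rather than those of the original.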

Upvotes: 10
