Dónal

Reputation: 187399

change file encoding

I have a problem with character encoding in some HTML pages. It seems that the cause of the problem is that some of the .html files are not saved as UTF-8 encoded files. Even though I have instructed Eclipse to save these files as UTF-8, when I open them in a browser, it indicates that the files are ISO-8859-1.

How can I change the encoding of these files to UTF-8?

UPDATE: I already have the following included in the <head> section of each webpage:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

I am using the Apache web server.

Thanks, Donal

Upvotes: 6

Views: 12905

Answers (6)

Tamas Czinege

Reputation: 121444

The problem with UTF-8 is that these files do not have to start with a magic byte sequence (the byte order mark is optional). Without one, the browser can only detect UTF-8 from the HTTP Content-Type header, the XML declaration, HTML meta tags, or some heuristics as a fallback.

Make sure that there is either an XML encoding declaration or some HTML meta tags in the header of the HTML.

<?xml version="1.0" encoding="utf-8"?>

as the very first line of the file, before the DOCTYPE, if it's XHTML, or

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

in the head section.

Upvotes: 4

Marco Lackovic

Reputation: 6517

In Eclipse 3.7, go to:

Window > Preferences > General > Workspace

Under "Text file encoding" set the file encoding you need.

Upvotes: 1

Akrikos

Reputation: 3662

You may need to change the content type header that your web server sends the client.

Edit: While this did work for this particular situation, using a tool to change the file encoding as suggested by other posters may be a better solution in other situations. YMMV.


Instructions for saving as UTF-8 in Eclipse (which I realize you already have):

You should probably change the Default Encoding in your workspace for the HTML document.

This is for Eclipse 3.4. If you have a different version, this may be slightly different.

Go to Window->Preferences
In the Preferences window, go to General->Content Types
At this point, you can specify a 'Default Encoding' for files near the bottom of the preferences window. Expand 'Text' and select HTML. In the 'Default Encoding' entry, put UTF-8. Then click 'Update' at the right.

After this, all HTML files should be saved in UTF-8 format.

Upvotes: 7

Aaron Novstrup

Reputation: 21017

As far as I know, setting the character encoding in Eclipse does not actually convert the files; it just tells Eclipse how you want them interpreted. Your best bet is to use a converter tool such as iconv, which Adam suggested.
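Before converting anything, it may help to confirm which files are really not UTF-8 on disk. The following is a minimal sketch in Python (not something from the original answer); the function name `is_valid_utf8` and the sample string are made up for illustration. It only checks whether the bytes decode cleanly as UTF-8, which is a heuristic: a pure-ASCII file passes, because ASCII is a subset of UTF-8.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The same text saved under two encodings. In ISO-8859-1 the accented
# letter is the single byte 0xF3, which is not a valid UTF-8 sequence here.
latin1_bytes = "D\u00f3nal".encode("iso-8859-1")
utf8_bytes = "D\u00f3nal".encode("utf-8")

print(is_valid_utf8(latin1_bytes))
print(is_valid_utf8(utf8_bytes))
```

To check a real file, you could pass it the result of `open(path, "rb").read()`.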

Upvotes: 0

Aaron Novstrup

Reputation: 21017

Try adding

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

to the head section of your HTML files, or ensure that your server is serving the files with a Content-Type HTTP header that specifies the charset. Without either of these, the browser can only guess at the character encoding.
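Note that when the server does send a charset in the Content-Type header, browsers give it priority over the meta tag, so a header announcing ISO-8859-1 would explain the symptom even with the tag in place. On Apache, the `AddDefaultCharset` directive is one setting that controls this. As a rough sketch of how the charset parameter is read out of such a header, here is a small Python example using the standard library's `email.message.Message`; the helper name `charset_from_content_type` is made up for illustration.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(header_value: str) -> Optional[str]:
    """Extract the charset parameter from a Content-Type header value.

    Returns the charset in lower case, or None if the header has none.
    (Hypothetical helper, for illustration only.)
    """
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()


print(charset_from_content_type("text/html; charset=UTF-8"))
print(charset_from_content_type("text/html; charset=ISO-8859-1"))
print(charset_from_content_type("text/html"))
```

The last case is the one described above: no charset in the header, so the client has to fall back on the meta tag or guess.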

Upvotes: 1

Adam Rosenfield

Reputation: 400682

You can use iconv to convert files from one character encoding to another.
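With iconv, a typical invocation looks like `iconv -f ISO-8859-1 -t UTF-8 input.html > output.html`. If iconv is not available, the same re-encoding can be sketched in Python. This assumes the source files really are ISO-8859-1; the function name `convert_to_utf8` and the file names are made up for illustration.

```python
import tempfile
from pathlib import Path


def convert_to_utf8(src: Path, dst: Path, source_encoding: str = "iso-8859-1") -> None:
    """Re-encode a text file from source_encoding to UTF-8 (hypothetical helper)."""
    # Decode the raw bytes using the encoding the file was actually saved in,
    # then write the same text back out as UTF-8.
    text = src.read_bytes().decode(source_encoding)
    dst.write_bytes(text.encode("utf-8"))


# Small demonstration on a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "page-latin1.html"
    dst = Path(tmp) / "page-utf8.html"
    src.write_bytes("<p>D\u00f3nal</p>".encode("iso-8859-1"))
    convert_to_utf8(src, dst)
    converted = dst.read_bytes()

print(converted)
```

One caveat: decoding as ISO-8859-1 never fails, because every byte value maps to a character. Running this on a file that is already UTF-8 would therefore silently double-encode it, so it is worth checking the files first.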

Upvotes: 3
