Brett Ryan

Reputation: 28295

Why does my HTML validator keep reporting a different encoding to that of the page?

I am using a local validator.nu instance to validate a site, but it keeps telling me the encoding does not match:

Internal encoding declaration “iso-8859-1” disagrees with the actual encoding of the document (“utf-8”).

I've tried everything I can think of to force the encoding to iso-8859-1, since we are using a legacy DB that requires this encoding.

  1. The process that starts the server forces LANG='iso-8859-1'
  2. file.encoding is forced on Tomcat startup with -Dfile.encoding=iso-8859-1; this is confirmed by checking Charset.defaultCharset(), which reports ISO-8859-1.
  3. Maven project resources are copied with iso-8859-1: <project.build.sourceEncoding>iso-8859-1</project.build.sourceEncoding>
  4. JSP page directive specifies encoding: <%@page contentType="text/html; charset=ISO-8859-1" pageEncoding="ISO-8859-1" %>
  5. Content-Type has been set in page head: <meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
  6. Tomcat URIEncoding set: <Connector port="80" protocol="HTTP/1.1" connectionTimeout="20000" URIEncoding="iso-8859-1" redirectPort="8443" />

What else could I have missed that's causing the page to come back as utf-8?

Interestingly it is rendering characters like © correctly, and if © is placed in a text input it is saved to the DB correctly using the 8859-1 codepage.
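One way to see which encoding is really on the wire is to look at the raw bytes of a non-ASCII character such as ©. The following is a minimal sketch, not part of the original setup: the class name `EncodingProbe` and its helpers `isValidUtf8` and `hex` are hypothetical names chosen for illustration. It relies on the fact that © is the single byte A9 in ISO-8859-1 but the two bytes C2 A9 in UTF-8, and that a lone A9 byte is not well-formed UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
public class EncodingProbe {

    // Strictly decodes the bytes as UTF-8; malformed input is reported, not replaced.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Formats bytes as space-separated upper-case hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String copyright = "\u00A9"; // the © character, escaped so the source file encoding does not matter
        byte[] latin1 = copyright.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = copyright.getBytes(StandardCharsets.UTF_8);

        System.out.println("ISO-8859-1: " + hex(latin1) + "  valid UTF-8? " + isValidUtf8(latin1));
        System.out.println("UTF-8:      " + hex(utf8) + "  valid UTF-8? " + isValidUtf8(utf8));
    }
}
```

The same `isValidUtf8` check could be run over the bytes of a page saved from the server. Note that a page containing only ASCII characters is valid in both encodings, so the check only tells them apart when a character like © is present.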

UPDATE: I downloaded a page from the server with cURL and uploaded it to the W3C checker, which validated it successfully. Its only complaint was that iso-8859-1 should be named windows-1252. I thought those two character sets were slightly different, but this w3 mailing-list entry says otherwise, so I need to look into that.
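On the iso-8859-1 versus windows-1252 point: as defined, the two character sets agree everywhere except the byte range 0x80 to 0x9F, where ISO-8859-1 has C1 control codes and windows-1252 has printable characters such as €. Browsers, however, treat the label iso-8859-1 as windows-1252 when decoding. The sketch below compares the two decoders byte by byte in Java. The class name `Latin1VsCp1252` and its methods are hypothetical, and it assumes the JDK in use ships the `windows-1252` charset (standard JDK distributions do).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical comparison class, for illustration only.
public class Latin1VsCp1252 {

    // Assumption: this charset is available in the running JDK.
    static final Charset CP1252 = Charset.forName("windows-1252");

    // Decodes a single byte value (0..255) with the given charset.
    static String decode(int byteValue, Charset charset) {
        return new String(new byte[] {(byte) byteValue}, charset);
    }

    // True if the two charsets decode this byte to different text.
    static boolean differsAt(int byteValue) {
        return !decode(byteValue, StandardCharsets.ISO_8859_1)
                .equals(decode(byteValue, CP1252));
    }

    public static void main(String[] args) {
        // List every byte value where the two decoders disagree.
        for (int b = 0; b < 256; b++) {
            if (differsAt(b)) {
                System.out.printf("0x%02X differs%n", b);
            }
        }
        // The copyright sign (byte 0xA9) lies outside the differing range.
        System.out.println("0xA9 differs? " + differsAt(0xA9));
    }
}
```

Since © is byte 0xA9 in both character sets, a page whose only non-ASCII characters are of that kind looks the same under either label.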

This is looking more and more like a bug in validator.nu which I will also look into.

Upvotes: 3

Views: 2293

Answers (2)

Brett Ryan

Reputation: 28295

I've found the problem!

The document is fine, the server is fine, and the validator actually validates fine. It's the Firefox plugin that changes the page encoding before sending the page to the validator, which gives me a false error.

I came to this conclusion with help from the [email protected] mailing list, and by switching from the Fx html5validator addon to the Fx web developer addon, which now validates my documents correctly. Validating with the local validator instance now works fine.

I've raised this issue with the original Firefox plugin.

Upvotes: 1

maksim_khokhlov

Reputation: 804

Try adding a filter (instance of javax.servlet.Filter declared with <filter> and <filter-mapping> tags in web.xml) that will set the desired character encoding on ServletRequest and ServletResponse instances coming into the doFilter() method as parameters.

See javadoc here and here.

Upvotes: 0
