Reputation: 1
I am building a newsletter builder in PHP. One of my requirements is that once an email has been composed in HTML, it is checked against the W3C standards, and a notice is shown to the end user if the validation run finds any errors.
At the moment I am using the W3C validator API via a PHP cURL request, following this: https://github.com/validator/validator/wiki/Service:-Input:-POST-body
My problem is that I can't seem to get the validator to process the html content using the XHTML1 doctype. By default, it expects to see the HTML5 doctype, and although there is the ability to set a query string parameter ('parser'), it seems the minimum version I am able to test is HTML4.
I have also tried leaving the 'parser' parameter both blank and with the value 'html' which should have made the validator use the doctype set in the html content for its validation, but this doesn't work either.
Is it possible to use the W3C validator API to validate XHTML1? And if not, is there an alternative API that would allow us to do so?
Upvotes: 0
Views: 189
Reputation: 88235
Maintainer of the W3C HTML checker (validator) here.
To check documents against the XHTML1 schema, you need to send:
- a schema query param with value http://s.validator.nu/xhtml10/xhtml-strict.rnc
- a Content-Type header with value application/xhtml+xml; charset=utf-8
For example, using curl to send a request, it would look like this:
curl -H "Content-Type: application/xhtml+xml; charset=utf-8" \
--data-binary @FILE.xhtml \
'https://validator.w3.org/nu/?schema=http://s.validator.nu/xhtml10/xhtml-strict.rnc&out=json'
…where FILE.xhtml is replaced with whatever the name is of the actual file you want to check, and the out=json query param specifies that you want JSON-formatted results from the checker. (Use out=xml if you want XML-formatted results, or out=gnu for results in the GNU error format.)
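Since the question is about showing a notice when errors are found, here is a rough sketch of counting errors in the out=json response from a shell script. It assumes the response contains a "messages" array whose entries carry a "type" field such as "error" or "info". The function name count_checker_errors is made up for illustration, and the grep-based matching is only an approximation; a real JSON parser would be more robust.

```shell
#!/bin/sh
# Hypothetical helper (not part of the checker): reads a JSON response on
# stdin and prints how many messages have "type" set to "error".
# Assumption: the response looks like {"messages":[{"type":"error",...},...]}.
count_checker_errors() {
    # grep -o prints each match on its own line; "|| true" keeps the
    # pipeline going when there are no matches at all.
    { grep -Eo '"type"[[:space:]]*:[[:space:]]*"error"' || true; } \
        | wc -l | tr -d '[:space:]'
    echo
}
```

It could be used by piping the curl output into it, for example `curl ... | count_checker_errors`, and showing the notice whenever the printed number is greater than zero.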
http://s.validator.nu/xhtml10/xhtml-strict.rnc is just an identifier the checker recognizes internally for the XHTML 1.0 Strict schema. There’s no actual schema on the Web at that URL.
The list of such identifiers that the checker recognizes is in the following file:
https://github.com/validator/validator/blob/master/resources/presets.txt
Note that you can include some additional checks by adding other identifiers to the schema value:
curl -H "Content-Type: application/xhtml+xml; charset=utf-8" \
--data-binary @FILE.xhtml \
'https://validator.w3.org/nu/?schema=http://s.validator.nu/xhtml10/xhtml-strict.rnc%20http://s.validator.nu/html4/assertions.sch%20http://c.validator.nu/all-html4/&out=json'
The schema identifiers must be separated by %20 (percent-encoded space character).
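If the list of identifiers varies, joining them by hand gets error-prone. A minimal sketch of a helper that joins any number of schema identifiers with %20 and builds the request URL might look like this. The function name build_checker_url is hypothetical, and out=json is hard-coded here purely for illustration.

```shell
#!/bin/sh
# Hypothetical helper: build a checker URL from one or more schema identifiers.
# Usage: build_checker_url SCHEMA_ID [SCHEMA_ID ...]
build_checker_url() {
    joined=""
    for id in "$@"; do
        if [ -z "$joined" ]; then
            joined="$id"
        else
            # Identifiers are separated by a percent-encoded space.
            joined="${joined}%20${id}"
        fi
    done
    printf '%s\n' "https://validator.w3.org/nu/?schema=${joined}&out=json"
}
```

The result can then be passed to curl in place of the literal URL, for example `curl -H "Content-Type: application/xhtml+xml; charset=utf-8" --data-binary @FILE.xhtml "$(build_checker_url http://s.validator.nu/xhtml10/xhtml-strict.rnc http://s.validator.nu/html4/assertions.sch)"`.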
Upvotes: 2