Reputation: 175
I'm trying to clean up some HTML files. I have converted them to XHTML with tidy:
$ tidy -asxml -i -w 150 -o o.xml index.html
The resulting XHTML contains named entities, and when I run xsltproc on those files I keep getting errors:
$ xsltproc --novalid -o out.htm t.xsl o.xml
o.xml:873: parser error : Entity 'mdash' not defined
resources to storing data and using permissions &mdash; as needed.</
^
o.xml:914: parser error : Entity 'uarr' not defined
</div><a href="index.html#top" style="float:right">&uarr; Go to top</a>
^
o.xml:924: parser error : Entity 'nbsp' not defined
Android 3.2 r1 - 27 Jul 2011 12:18
If I add --html to the xsltproc command, it complains about a tag whose name and id attributes have the same value (which is valid):
$ xsltproc --novalid --html -o out.htm t.xsl o.xml
o.xml:845: element a: validity error : ID top already defined
<a name="top" id="top"></a>
^
The xslt is simple:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" indent="yes" omit-xml-declaration="yes"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="//*[@id='side-nav']"/>
</xsl:stylesheet>
Why doesn't --html work? Why is it complaining? Or should I forget it and fix the entities?
Upvotes: 3
Views: 4641
Reputation: 175
I went the other way: I made tidy produce numeric entities rather than named ones with the -n option.
$ tidy -asxml -i -n -w 150 -o o.xml index.xml
Now I can drop the --html option and it works. I could also remove that name attribute, but I still wonder why it is reported as an error when it is valid.
Upvotes: 1
Reputation: 51002
I am assuming that the unclearly stated question is this: I know how to avoid "Entity 'XXX' not defined" errors when running xsltproc (add --html). But how do I get rid of "ID YYY already defined"?
Recent builds of Tidy have an anchor-as-name option. You can set it to "no" to remove unwanted name attributes:
This option controls the deletion or addition of the name attribute in elements where it can serve as anchor. If set to "yes", a name attribute, if not already existing, is added along an existing id attribute if the DTD allows it. If set to "no", any existing name attribute is removed if an id attribute exists or has been added.
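As a rough sketch of how this could fit into the commands from the question: Tidy configuration options can be given on the command line as --option value, so anchor-as-name can be combined with -n (numeric entities). The file names sample.html and sample.xml are hypothetical stand-ins for the real input and output, and the tidy step is skipped when tidy is not installed. Older Tidy builds may not know the option.

```shell
#!/bin/sh
# Hypothetical sample input, standing in for the question's index.html.
cat > sample.html <<'EOF'
<html>
<head><title>Sample</title></head>
<body>
<a name="top" id="top"></a>
<p>storing data and using permissions &mdash; as needed.</p>
</body>
</html>
EOF

if command -v tidy >/dev/null 2>&1; then
  # -n writes numeric rather than named entities.
  # --anchor-as-name no removes name where an id attribute exists.
  # tidy exits with status 1 on warnings, so that is not treated as fatal here.
  tidy -q -asxml -n --anchor-as-name no -i -w 150 -o sample.xml sample.html || true
else
  echo "tidy not found; skipping conversion"
fi

# With the duplicate name gone, the stylesheet from the question (t.xsl)
# could then be applied; this step is skipped if the file or tool is missing.
if [ -s sample.xml ] && [ -f t.xsl ] && command -v xsltproc >/dev/null 2>&1; then
  xsltproc --novalid -o out.htm t.xsl sample.xml || echo "xsltproc reported errors"
fi
```

If the option is accepted, sample.xml should contain the anchor with only its id attribute, which can be checked with grep before running xsltproc.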
Upvotes: 0