Stanley

Reputation: 569

xml parsing with "&", "®", but still getting errors

Everywhere I look, posts are telling me to escape xml special characters with their html entity, but I'm still getting XML parsing errors. The error message I'm receiving is "unidentified entity", and it occurs at the &amp ; and &reg ; marks (without the spaces). How can I fix this and why would this still be throwing errors?

<?xml version="1.0" encoding="UTF-8"?>
<maps>
    <location id="tx">
        <item label="Lobby &amp; Entrance" xpos="125" ypos="112" />
        <item label="Restaurant &amp; Bar" xpos="186" ypos="59" />
        <item label="Swimming Pool" xpos="183" ypos="189" />
        <item label="Nautilus Gym&reg;" xpos="154" ypos="120" />
    </location>
</maps>

Upvotes: 4

Views: 14890

Answers (4)

Quentin

Reputation: 944320

Everywhere I look, posts are telling me to escape xml special characters with their html entity

Don't. Use XML entities.

The error message I'm receiving is "unidentified entity", and it occurs at the &amp; and &reg; marks.

You shouldn't get a problem with &amp; as that is part of XML. You must be using a broken parser. It is hard to tell though as you have not provided any of the code you are using to parse this.

&reg;, on the other hand, will not be recognized by an XML parser unless you include a DTD that defines it. Use a numeric character reference (&#174;) or, better yet, the real character and a suitable (UTF-8) character encoding.
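The difference is easy to check with any conforming parser. Below is a minimal sketch using Python's standard xml.etree.ElementTree (an assumption, since the question does not say which parser is in use); try_parse is a hypothetical helper written only for this illustration.

```python
import xml.etree.ElementTree as ET


def try_parse(xml_text):
    """Return the item's label attribute, or the parser's error message."""
    try:
        return ET.fromstring(xml_text).get("label")
    except ET.ParseError as exc:
        return f"error: {exc}"


# &amp; is predefined by XML, so this parses.
amp_ok = try_parse('<item label="Lobby &amp; Entrance" />')

# &reg; is an HTML entity that XML does not predefine, so this fails
# unless a DTD declares it.
reg_bad = try_parse('<item label="Nautilus Gym&reg;" />')

# A numeric character reference needs no declaration.
reg_numeric = try_parse('<item label="Nautilus Gym&#174;" />')

# The literal character works too, given a suitable encoding.
reg_literal = try_parse('<item label="Nautilus Gym\u00ae" />')

for result in (amp_ok, reg_bad, reg_numeric, reg_literal):
    print(result)
```

If &amp; also fails in your environment, that points at the parser or at something altering the file before it is parsed, not at the XML itself.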

Upvotes: 1

fmsf

Reputation: 37177

Replace &reg; with &#174; and &amp; with &#38;, and your XML will be well-formed.
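If the file contains many HTML entity names, the replacement can be automated. The following is a rough sketch, assuming Python and its standard html.entities table; html_entities_to_numeric is a hypothetical helper name, and it leaves the five names that XML itself predefines (including &amp;) untouched, since those already parse.

```python
import re
from html.entities import name2codepoint

# Entity names that XML predefines; these need no conversion.
XML_PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}


def html_entities_to_numeric(text):
    """Rewrite HTML named entities (e.g. &reg;) as numeric references."""

    def replace(match):
        name = match.group(1)
        # Leave XML's own entities and unknown names exactly as they are.
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return f"&#{name2codepoint[name]};"

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, text)


converted = html_entities_to_numeric('<item label="Nautilus Gym&reg;" />')
unchanged = html_entities_to_numeric('<item label="Lobby &amp; Entrance" />')
print(converted)
print(unchanged)
```

This works on the raw text before parsing, so it is only appropriate when you control the file and know the named entities are meant as HTML entities.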

Upvotes: 13

don_jones

Reputation: 852

XML only predefines five entities: &amp;, &lt;, &gt;, &apos; and &quot;. &reg; is invalid unless you declare it in some way.

Upvotes: 1

Laurence Gonsalves

Reputation: 143314

XML only has a small number of "built-in" character entity names. "amp" is one of the built-ins, so it seems unlikely that you're getting an error there. "reg" is not built-in, however.

To fix this you can either use a numeric reference in place of reg, use the actual character, or include an entity declaration for reg in the document's DTD, like this:

<!ENTITY reg "&#174;">

You can look in the XHTML DTDs to get the complete set of entity declarations for HTML entities.
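As a sketch of where such a declaration goes: it belongs inside a DOCTYPE, for example in the internal subset at the top of the document. The example below assumes Python's standard xml.etree.ElementTree, which expands entities declared in the internal subset; other parsers may need DTD processing enabled.

```python
import xml.etree.ElementTree as ET

# The internal DTD subset (between [ and ]) declares the reg entity,
# so the &reg; reference in the attribute value can be resolved.
doc = """<!DOCTYPE maps [
  <!ENTITY reg "&#174;">
]>
<maps>
    <location id="tx">
        <item label="Nautilus Gym&reg;" xpos="154" ypos="120" />
    </location>
</maps>"""

root = ET.fromstring(doc)
label = root.find("location/item").get("label")
print(label)
```

The declaration has to be repeated (or pulled in from a shared DTD) in every document that uses &reg;, which is why the numeric reference or the literal character is often the simpler fix.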

Upvotes: 6
