Reputation: 11
I'm pulling a bunch of data from one database and feeding it into an application via XML.
So I start with
require 'rexml/document'
include REXML

re_objects_xml = Document.new
re_objects_xml.context[:attribute_quote] = :quote
re_objects_xml.context[:raw] = 'true'
re_objects_xml.add_element("object-collection")
base_object_collection = re_objects_xml.elements[1]
timeline_meta = Element.new("Metadata")
timeline_meta.add_attribute("id", "#{re_meta_id}")
and then I have the following variables:
k = "Comments"
v = "We're pretty good"
and I do
timeline_meta.add_attribute("#{k}","#{v}")
And then add timeline_meta to base_object_collection
base_object_collection << timeline_meta
I end up with XML that contains this:
...Comments="GRUBB: We&apos;re pretty good...
I'm trying to get
...Comments="GRUBB: We're pretty good...
Can anyone help me see what I'm missing or a better way to do this?
Upvotes: 1
Views: 765
Reputation: 832
I know this question is very old, but I just came across the same issue, and my findings might help people who are still forced to work with Ruby 1.8.6.
The REXML implementation depends heavily on the Ruby version; it even differs considerably between patch levels of Ruby 1.8.6.
The context flag that should stop REXML from escaping entities is :raw, but the fact that it's not working in your case could mean that REXML doesn't understand the flag or the value that you're setting it to.
If you're using a Ruby version earlier than 1.8.6-p110 then you're out of luck. Those versions don't support context flags like :attribute_quote or :raw. So your only options are to either
Upgrade to a later version of Ruby, 1.8.6-p110 and up.
Or post-process the raw XML, replacing the escaped entities. This should work since REXML will convert ' to &apos; and & to &amp;
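The post-processing option could look something like the sketch below. The helper name unescape_apostrophes and the sample string are made up for illustration. It only touches &apos;, and it assumes every attribute value is delimited with double quotes; with single-quoted attributes the result would no longer be well-formed.

```ruby
# Hypothetical helper (not part of REXML): turn &apos; back into a literal
# apostrophe in XML that has already been serialized to a string.
# Assumption: attribute values are wrapped in double quotes.
def unescape_apostrophes(xml)
  xml.gsub("&apos;", "'")
end

escaped = %q(<Metadata Comments="We&apos;re pretty good"/>)
puts unescape_apostrophes(escaped)
```

Other entities such as &amp; are left alone on purpose, since unescaping them could break the document.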
If you're using a later version of Ruby, then context[:raw] has to be set to :all or a list of names to process in raw mode. The context can also be passed into the Document constructor, like so: Document.new(nil, {:raw => :all, :attribute_quote => :quote})
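Here is a minimal sketch of passing the context to the constructor, reusing the element names from the question. Whether the apostrophe still comes out as &apos; depends on the REXML version, so the example doesn't assume either result; it only re-parses the output to show that the value itself survives.

```ruby
require 'rexml/document'

# Context handed to the constructor instead of being set afterwards.
doc = REXML::Document.new(nil, { :raw => :all, :attribute_quote => :quote })
root = doc.add_element("object-collection")

meta = root.add_element("Metadata")
meta.add_attribute("Comments", "We're pretty good")

xml = doc.to_s
puts xml  # how the apostrophe is written may vary by REXML version

# Re-parsing gives back the original string either way.
reparsed = REXML::Document.new(xml)
round_tripped = reparsed.root.elements["Metadata"].attributes["Comments"]
puts round_tripped
```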
Upvotes: 0
Reputation: 160581
Why are you worrying about a single-quote/apostrophe being converted into the entity? The XML parser/engine does that to help preserve what could be an ambiguous/colliding delimiting character. From the XML spec about Character Data and Markup:
To allow attribute values to contain both single and double quotes, the
apostrophe or single-quote character (') may be represented as "&apos;", and
the double-quote character (") as "&quot;".
Because we can delimit the content of the Comments attribute using either ' or ", the spec allows for encoding the embedded single and double quotes as entities, avoiding collisions.
When the XML is parsed on the receiving side, the parser should decode that entity back into the correct character, or provide some function/method that makes it easy. You don't specify which DBM you're using, but it should be able to help; that's a separate question, though.
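To illustrate the decoding step, here is a small sketch that uses REXML itself as a stand-in for the receiving side. The sample XML string, including the id value, is an assumption made up for the example.

```ruby
require 'rexml/document'

# What the receiving application might be handed (sample data, assumed).
incoming = %q(<Metadata id="1" Comments="We&apos;re pretty good"/>)

doc = REXML::Document.new(incoming)

# Reading the attribute value returns the decoded text, not the entity.
comments = doc.root.attributes["Comments"]
puts comments
```

The entity exists only in the serialized form; code that reads the attribute through a parser sees a plain apostrophe.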
As a stylistic thing in your code:
timeline_meta.add_attribute("#{k}","#{v}")
is wrong. You're redundantly converting strings into strings. Use:
timeline_meta.add_attribute(k, v)
instead.
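For instance, if the key/value pairs arrive in a hash, the loop might look like the sketch below. The fields hash and its contents are hypothetical stand-ins for the database rows in the question.

```ruby
require 'rexml/document'

# Hypothetical source data; in the question these pairs come from a database.
fields = { "Comments" => "We're pretty good", "Owner" => "GRUBB" }

timeline_meta = REXML::Element.new("Metadata")
fields.each do |k, v|
  # k and v are already Strings, so no "#{...}" interpolation is needed.
  timeline_meta.add_attribute(k, v)
end

puts timeline_meta.attributes["Comments"]
```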
Upvotes: 1