Reputation: 649
I am trying to create an XML file from an array. This is my builder code:
def buildXML(formattedText)
  builder = Nokogiri::XML::Builder.new do |xml|
    xml.products {
      formattedText.each do |lineItem|
        xml.item {
          xml.articleNumber lineItem[0]
          description = lineItem[1..(findIndexOnShtrih(lineItem)-1)].join(" ").force_encoding(Encoding::Windows_1251)
          xml.description description
          xml.shtrihCode lineItem.at(findIndexOnShtrih(lineItem))
        }
      end
    }
  end
end
My input looks like this (it always contains an article number at the 1st index, followed by the description from the 2nd to the N-3 index; N-2 to N-1 is the amount, and the Nth index contains the bar code):
["047609", "СОК", "СВЕЖЕВЫЖАТЫЙ", "ТОМАТ", "200", "МЛ", "(фреш", "дня)", "1", "шт", "2400000032731"]
["048504", "ВОДА", "ГАЗИРОВАННАЯ", "С", "НАТУРАЛЬНЫМ", "СИРОПОМ", "(200МЛ)", "1", "шт", "2400000032953"]
["055794", "СОК", "СВЕЖЕВЫЖАТЫЙ", "В", "АССОРТИМЕНТЕ", "(200МЛ)", "1", "шт", "2400000036425"]
["058270", "СОК", "СВЕЖЕВЫЖАТЫЙ", "КЛУБНИКА", "+ЯБЛОКО", "200", "МЛ", "(фреш", "дня)", "1", "шт", "2400000037149"]
This leads to stuff like this:
<articleNumber>055794</articleNumber>
<description>СОК СВЕЖЕВЫЖАТЫЙ В АССОРТИМЕНТЕ (200МЛ) 1 шт</description>
<shtrihCode>2400000036425</shtrihCode>
</item>
<item>
<articleNumber>058270</articleNumber>
<description>СОК СВЕЖЕВЫЖАТЫЙ КЛУБНИКА +ЯБЛОКО 200 МЛ (фреш дня) 1 шт</description>
<shtrihCode>2400000037149</shtrihCode>
</item>
</products>
Basically, I want the description in the XML to show proper Cyrillic letters.
Can I somehow force the builder to use a specific encoding? I've found a lot of material on how to open XML files with a certain encoding, for example using Nokogiri::XML(a, nil, "UTF-8"), but nothing on how to build valid XML.
Surprisingly enough, if I omit the code block around my text, SO displays it just fine.
Upvotes: 0
Views: 569
Reputation: 649
After hours of trying, I found this post: How do I encode/decode HTML entities in Ruby?
You need to decode entity values such as the one for С, according to this table:
http://webdesign.about.com/od/localization/l/blhtmlcodes-ru.htm
CGI didn't help me, but HTMLEntities did.
This is my working code right now:
require 'htmlentities'
puts HTMLEntities.new.decode(buildXML(cleansedArray).to_xml)
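To make the decoding step above more concrete, here is a minimal stdlib-only sketch of what decoding numeric character references amounts to. It is not how HTMLEntities is implemented: `decode_numeric_refs` is a hypothetical helper, the sample string is made up for illustration, and the sketch handles only numeric references (decimal and hexadecimal), not named entities.

```ruby
# Hypothetical helper: replaces numeric character references such as
# &#1057; (decimal) or &#x421; (hexadecimal) with the UTF-8 character
# for that code point. Named entities like &amp; are left untouched.
def decode_numeric_refs(text)
  text.gsub(/&#(?:[xX](\h+)|(\d+));/) do
    codepoint = $1 ? $1.to_i(16) : $2.to_i
    [codepoint].pack('U') # 'U' packs a code point as a UTF-8 character
  end
end

# Made-up sample mixing decimal and hexadecimal references.
sample = "<description>&#1057;&#1054;&#1050; (200&#x41C;&#x41B;)</description>"
puts decode_numeric_refs(sample)
```

Note that decoding the whole serialized document, as in the code above, would also turn escaped characters such as &lt; or &amp; back into raw markup, so it seems safe only when the data contains none of them.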
And finally the desired output:
<item>
<articleNumber>055794</articleNumber>
<description>СОК СВЕЖЕВЫЖАТЫЙ В АССОРТИМЕНТЕ (200МЛ) 1 шт</description>
<shtrihCode>2400000036425</shtrihCode>
</item>
<item>
<articleNumber>058270</articleNumber>
<description>СОК СВЕЖЕВЫЖАТЫЙ КЛУБНИКА +ЯБЛОКО 200 МЛ (фреш дня) 1 шт</description>
<shtrihCode>2400000037149</shtrihCode>
</item>
</products>
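One thing that may be worth checking, although nothing in this thread confirms it was the cause: the builder code calls force_encoding(Encoding::Windows_1251), and force_encoding only relabels a string without converting its bytes. The sketch below, which assumes a UTF-8 source string and uses hypothetical helper names, contrasts it with encode, which actually converts.

```ruby
# Relabel only: the bytes stay the same, Ruby is just told to treat them
# as Windows-1251.
def relabel_as_cp1251(text)
  text.dup.force_encoding(Encoding::Windows_1251)
end

# Transcode: the bytes are rewritten so that the same characters are
# stored in Windows-1251.
def transcode_to_cp1251(text)
  text.encode(Encoding::Windows_1251)
end

word = "СОК" # UTF-8 literal: 3 characters stored in 6 bytes

relabelled = relabel_as_cp1251(word)
transcoded = transcode_to_cp1251(word)

puts relabelled.bytes == word.bytes              # same bytes, different label
puts relabelled.length                           # now counted byte by byte
puts transcoded.bytes.length                     # one byte per character
puts transcoded.encode(Encoding::UTF_8) == word  # round-trips to the original
```

If the input arrays are already UTF-8, relabelling them as Windows-1251 changes how the same bytes are interpreted downstream, which is one possible reason for unexpected output.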
Upvotes: 1