anoam

Reputation: 139

Parse XML in Windows-1251 encoding?

I'm trying to run:

Nokogiri::XML(open("http://my.url.com/any/path.xml"))

For example:

Nokogiri::XML(open("http://bar-navig.yandex.ru/u?ver=2&show=32&url=google.com"))

But I get:

Nokogiri::XML::SyntaxError: Unsupported encoding windows-1251

This happens only on the server; on my local computer it works fine.

It looks like iconv supports this encoding:

iconv --list | grep 1251
CP1251 MS-CYRL WINDOWS-1251

And even if I run this in bash:

xmllint 'http://bar-navig.yandex.ru/u?ver=2&show=32&url=google.com'

It works fine.

Environment: Ruby 1.9.3, Rails 3.2.16, Nokogiri 1.6.1, FreeBSD 8.1

Here is a sample of the code (see line 16): https://github.com/anoam/seo_params/blob/master/lib/seo_params/yandex.rb

And this is a sample URL: http://bar-navig.yandex.ru/u?ver=2&show=32&url=google.com

How can I solve this?

Upvotes: 0

Views: 1213

Answers (2)

anoam

Reputation: 139

The problem was solved in this Nokogiri issue: https://github.com/sparklemotion/nokogiri/issues/1093

Thanks everyone!

Upvotes: 0

the Tin Man

Reputation: 160551

Nokogiri::XML is a shortcut for Nokogiri::XML::Document.parse(), so look at the documentation for Nokogiri::XML::Document.parse():

parse(string_or_io, url = nil, encoding = nil, options = ParseOptions::DEFAULT_XML, &block)

encoding (optional) is the encoding that should be used when processing the document, so you can pass it explicitly as the third argument instead of relying on detection.
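If the parser on the server still rejects the encoding, one possible workaround is to transcode the document in Ruby before handing it to Nokogiri. The following is only a minimal sketch under some assumptions: `xml_to_utf8` is a hypothetical helper (not part of Nokogiri), the sample bytes are made up for illustration, and the input is assumed to really be Windows-1251. It is not necessarily the fix described in the linked issue.

```ruby
# Hypothetical helper (not part of Nokogiri): transcode raw XML bytes to UTF-8
# using Ruby's own converters, and rewrite the encoding named in the XML
# declaration so that it matches the new content.
def xml_to_utf8(raw, source_encoding = 'Windows-1251')
  # Label the bytes with their real encoding, then convert them to UTF-8.
  utf8 = raw.dup.force_encoding(source_encoding).encode('UTF-8')
  # Replace the declared encoding (if any) with UTF-8.
  utf8.sub(/\A(<\?xml[^>]*?encoding=)(["'])[^"']*\2/i) { "#{$1}#{$2}UTF-8#{$2}" }
end

# Made-up sample: a small document whose text is Windows-1251 encoded Cyrillic.
raw = %(<?xml version="1.0" encoding="windows-1251"?><name>\xCF\xF0\xE8\xE2\xE5\xF2</name>).b

utf8_xml = xml_to_utf8(raw)
puts utf8_xml
```

The result can then be parsed with the encoding stated explicitly, using the signature above: Nokogiri::XML(utf8_xml, nil, 'UTF-8'). For a remote document, the raw bytes would come from reading the opened URL (for example open(url).read) instead of the sample string.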

Upvotes: 0
