Dick Colt

Reputation: 1555

L O S T :( Dumped some data with Ruby YAML, can't read it back

I saved some objects to disk using the following code (this is Ruby 1.9.2 on Windows, BTW):

open('1.txt', "wb") { |file|
    file.write(YAML::dump( results))
}

Now I'm trying to read that data back, but I get 'invalid byte sequence in UTF-8 (ArgumentError)'. I've tried everything I could think of to save the data in a different format, but no luck. For example:

# Assign the block's return value so a1 is visible outside the block.
a1 = open('1.txt', 'rb') { |f| YAML::load(f.read) }
a1.each do |a|
    JSON.generate(a)
end

results in:

 C:/m/ruby-1.9.2-p0-i386-mingw32/lib/ruby/1.9.1/json/common.rb:212:in `match': invalid byte sequence in UTF-8 (ArgumentError)
    from C:/m/ruby-1.9.2-p0-i386-mingw32/lib/ruby/1.9.1/json/common.rb:212:in `generate'
    from C:/m/ruby-1.9.2-p0-i386-mingw32/lib/ruby/1.9.1/json/common.rb:212:in `generate'
    from merge3.rb:31:in `block in <main>'
    from merge3.rb:29:in `each'
    from merge3.rb:29:in `<main>'

What can I do?

EDIT: from the file:

--- 
- !ruby/object:Product 
  name: HSF
- !ruby/object:Product
  name: "almer\xA2n"

The 1st product works OK, but the 2nd gives the exception.

Upvotes: 0

Views: 630

Answers (3)

the Tin Man

Reputation: 160571

I'm not sure if this is what you're after, but currently your YAML file looks like:

--- 
- !ruby/object:Product 
  name: HSF
- !ruby/object:Product
  name: "almer\xA2n"

If you remove the !ruby/object:Product from the array lines you'll get an array of hashes:

--- 
- name: HSF
- name: "almer\xA2n"

results in:

YAML::load_file('test.yaml') #=> [{"name"=>"HSF"}, {"name"=>"almer\xA2n"}]

If I print the second element's value when my terminal is set to a Windows character set, I see the cent sign. So, if you're trying to regain access to the data, all you have to do is edit the data file a bit.
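
To get from those raw bytes to a string that JSON.generate will accept, the value still has to be transcoded. Here is a minimal sketch, assuming the bytes were written as Windows-1252. That is only a guess based on the cent sign; if the data came from a DOS code page, a different encoding name would be needed:

```ruby
# The problem value from the file, as raw bytes.
raw = "almer\xA2n".b

# Labelled as UTF-8 these bytes are invalid, which is what triggers the error.
as_utf8 = raw.dup.force_encoding('UTF-8')
puts as_utf8.valid_encoding?   # false

# Assumption: the original encoding was Windows-1252. Relabel the bytes with
# that encoding, then transcode them to UTF-8.
fixed = raw.dup.force_encoding('Windows-1252').encode('UTF-8')
puts fixed.valid_encoding?     # true
```

force_encoding only changes the label on the bytes, while encode actually converts them, so the order of the two calls matters.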

Upvotes: 0

tadman

Reputation: 211670

This is probably your encoding being wrong. You could try this:

Encoding.default_external = 'BINARY'

This should read the file in raw rather than interpreting it as UTF-8. You are presumably using some kind of ISO-8859-1 accented character.
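
If changing the global default feels too broad, roughly the same thing can be done per file. A rough sketch, using a temporary file (a hypothetical stand-in for the real 1.txt) that contains the problem byte:

```ruby
require 'tmpdir'

# Hypothetical stand-in for the question's 1.txt, containing the problem byte.
path = File.join(Dir.tmpdir, 'binary_read_example.txt')
File.binwrite(path, "name: almer\xA2n\n".b)

# File.binread returns the bytes untouched, tagged as BINARY (ASCII-8BIT).
data = File.binread(path)
puts data.encoding == Encoding::BINARY   # true
puts data.bytes.include?(0xA2)           # true

# The same bytes labelled as UTF-8 would not be a valid string.
puts data.dup.force_encoding('UTF-8').valid_encoding?   # false

File.delete(path)
```

Reading raw only avoids the interpretation step; the bytes still need to be given their real encoding before anything treats them as text.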

Upvotes: 1

Jörg W Mittag

Reputation: 369526

You need to read the file back in using the same encoding you used to write it out, obviously. Since you don't specify an encoding in either case, you will basically end up with an environment-dependent encoding outside of your control, which is why it is never a good idea to not specify an encoding.

The snippet you posted is clearly not valid UTF-8, so the fact that you get an exception is perfectly appropriate.
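
As an illustration of specifying the encoding on both sides, here is a small sketch. The sample data and file name are made up for the example, and UTF-8 is just one reasonable choice; the point is only that the write and the read name the same encoding:

```ruby
require 'yaml'
require 'tmpdir'

# Hypothetical sample data standing in for the question's `results`.
results = [{ 'name' => 'HSF' }, { 'name' => "almer\u00F3n" }]

path = File.join(Dir.tmpdir, 'encoding_example.yaml')

# Write with an explicit encoding instead of relying on the environment default.
File.open(path, 'w:UTF-8') { |file| file.write(YAML.dump(results)) }

# Read back with the same explicit encoding.
loaded = File.open(path, 'r:UTF-8') { |file| YAML.load(file.read) }

puts loaded == results   # true

File.delete(path)
```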

Upvotes: 0
