Janko

Reputation: 9305

What is the best way to parse a YAML-like string in Ruby?

I want to parse output from identify -verbose (ImageMagick) and return it as a Hash. The output looks something like this:

  Format: JPEG (Joint Photographic Experts Group JFIF format)
  Mime type: image/jpeg
  Class: DirectClass
  Geometry: 200x276+0+0
  Resolution: 300x300
  Print size: 0.666667x0.92
  Units: PixelsPerInch
  Type: TrueColor
  Endianess: Undefined
  Colorspace: sRGB
  Depth: 8-bit
  Channel depth:
    red: 8-bit
    green: 8-bit
    blue: 8-bit
  Channel statistics:
    Pixels: 55200
    Red:
      min: 0 (0)
      max: 255 (1)
      mean: 53.5216 (0.209889)
      standard deviation: 50.4831 (0.197973)
      kurtosis: 1.76124
      skewness: 1.173

I tried to cheat and use YAML.load on it, but I get a ParseError (it was a long shot anyway; who knows what makes it invalid YAML).

So, is there an elegant way to parse this into a nested hash? I want to get the output like this:

{
  "Format" => "JPEG (Joint Photographic Experts Group JFIF format)",
  "Mime type" => "image/jpeg",
  "Class" => "DirectClass",
  ...
  "Channel depth" => {
    "red" => "8-bit",
    "green" => "8-bit",
    "blue" => "8-bit",
  },
  ...
}

Upvotes: 0

Views: 116

Answers (1)

maerics

Reputation: 156434

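Try converting it to valid YAML. The output is already very close: the only problem is the first line, Image: <name>, which gives Image a scalar value and then nests further keys under it. If you replace that line with Image:\n  Name: <name>, the output can be parsed directly.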

Here's a kludge that will work for your example at least:

require 'yaml'

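# Rewrite the "Image: <name>" line as a mapping with a nested "Name" key
# so that the whole output becomes valid YAML, then parse it.
def parse_image_magick_output(str)
  YAML.load(str.sub(/^Image:\s*(.*?)$/, "Image:\n  Name: \\1"))
end

# get_im_output is a placeholder for the captured `identify -verbose` output.
pp parse_image_magick_output(get_im_output) # =>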
# {"Image"=>
#   {"Name"=>"spec/fixtures/default.jpg",
#    "Format"=>"JPEG (Joint Photographic Experts Group JFIF format)",
#    "Mime type"=>"image/jpeg",
#    ...
#        "standard deviation"=>"50.4831 (0.197973)",
#        "kurtosis"=>1.76124,
#        "skewness"=>1.173}}}}

Of course, other ImageMagick subcommands might produce similarly non-YAML output (not just this Image/name case), so there's no guarantee this example will work in general. With a little more effort, you could probably handle the general case by using a lookahead to find the lines that should be YAML mappings.
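If the YAML route turns out to be too fragile, another option is to skip YAML entirely and build the hash from the indentation. The following is only a rough sketch, not something taken from ImageMagick's documentation: parse_indented is a made-up name, and it assumes space-only indentation, one "key: value" pair per line, and that a key with no value opens a nested level. Unlike YAML.load, it leaves every value as a String (so "kurtosis" would be "1.76124", not a Float).

```ruby
# Parse indentation-nested "key: value" text into a nested Hash.
# Hypothetical helper; assumes spaces-only indentation.
def parse_indented(text)
  root = {}
  # Stack of [indent, hash] pairs; the last entry is the innermost open hash.
  stack = [[-1, root]]

  text.each_line do |line|
    next if line.strip.empty?

    indent = line[/\A */].size
    # Split on the first colon that is followed by whitespace or end of line,
    # so values such as "image/jpeg" or "0 (0)" stay intact.
    key, value = line.strip.split(/:(?:\s+|\z)/, 2)

    # Close every hash opened at the same or a deeper indentation level.
    stack.pop while stack.last[0] >= indent

    parent = stack.last[1]
    if value.nil? || value.empty?
      # A key without a value starts a nested hash.
      child = {}
      parent[key] = child
      stack.push([indent, child])
    else
      parent[key] = value
    end
  end

  root
end

sample = <<~TEXT
  Format: JPEG (Joint Photographic Experts Group JFIF format)
  Channel depth:
    red: 8-bit
    green: 8-bit
  Type: TrueColor
TEXT

pp parse_indented(sample)
```

Because it only looks at indentation, this does not care whether a line such as Image: <name> also has nested children; the children would simply be attached to the enclosing level instead, so you may still want to special-case that line.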

Upvotes: 1
