Ninjaxor

Reputation: 998

How do I check if a string is valid YAML?

I'd like to check if a string is valid YAML. I'd like to do this from within my Ruby code with a gem or library. I only have this begin/rescue clause, but it doesn't get rescued properly:

def valid_yaml_string?(config_text)
  require 'open-uri'
  file = open("https://github.com/TheNotary/the_notarys_linux_mint_postinstall_configuration")
  hard_failing_bad_yaml = file.read
  config_text = hard_failing_bad_yaml
  begin
    YAML.load config_text
    return true
  rescue
    return false
  end
end

Unfortunately, instead of returning false, it raises this error:

irb(main):089:0> valid_yaml_string?("b")
Psych::SyntaxError: (<unknown>): mapping values are not allowed in this context at line 6 column 19
from /home/kentos/.rvm/rubies/ruby-1.9.3-p374/lib/ruby/1.9.1/psych.rb:203:in `parse'
from /home/kentos/.rvm/rubies/ruby-1.9.3-p374/lib/ruby/1.9.1/psych.rb:203:in `parse_stream'
from /home/kentos/.rvm/rubies/ruby-1.9.3-p374/lib/ruby/1.9.1/psych.rb:151:in `parse'
from /home/kentos/.rvm/rubies/ruby-1.9.3-p374/lib/ruby/1.9.1/psych.rb:127:in `load'
from (irb):83:in `valid_yaml_string?'
from (irb):89
from /home/kentos/.rvm/rubies/ruby-1.9.3-p374/bin/irb:12:in `<main>'

Upvotes: 4

Views: 8506

Answers (1)

the Tin Man

Reputation: 160551

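Using a cleaned-up version of your code that rescues Exception instead of using a bare rescue (a bare rescue only catches StandardError and its subclasses, and in Ruby 1.9.3 Psych::SyntaxError descends from SyntaxError instead):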

require 'yaml'
require 'open-uri'

URL = "https://github.com/TheNotary/the_notarys_linux_mint_postinstall_configuration"

def valid_yaml_string?(yaml)
  !!YAML.load(yaml)
rescue Exception => e
  STDERR.puts e.message
  return false
end

puts valid_yaml_string?(open(URL).read)

I get:

(<unknown>): mapping values are not allowed in this context at line 6 column 19
false

when I run it.

The reason is that the data you are getting from that URL isn't YAML at all; it's the HTML of the GitHub page:

open('https://github.com/TheNotary/the_notarys_linux_mint_postinstall_configuration').read[0, 100]
=> "  \n\n\n<!DOCTYPE html>\n<html>\n  <head prefix=\"og: http://ogp.me/ns# fb: http://ogp.me/ns/fb# githubog:"

If you only want a true/false answer to whether the string is parsable YAML, without the error message, remove this line:

STDERR.puts e.message

Unfortunately, going beyond that and determining if the string is a YAML string gets harder. You can do some sniffing, looking for some hints:

yaml[/^---/m]

will search for the YAML "document" marker, but a YAML file doesn't have to use those, nor do they have to be at the start of the file. We can add that in to tighten up the test:

!!YAML.load(yaml) && !!yaml[/^---/m]
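To illustrate the trade-off, here is a minimal sketch. The helper name sniff and the two sample strings are made up for this example; they are not part of the code above.

```ruby
require 'yaml'

# Hypothetical helper: reports whether the text parses to something truthy
# and whether any line starts with the "---" document marker.
def sniff(text)
  { parses: !!YAML.load(text), has_marker: !!text[/^---/] }
end

p sniff("---\nname: example\n")  # parses, and has the marker
p sniff("name: example\n")       # parses, but has no marker
```

The second string is perfectly valid YAML, yet it fails the marker test, so the tightened check would reject it.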

But even that leaves some holes, so adding a test to see what the parser returns can help even more. YAML.load could return a Fixnum (Integer in Ruby 2.4 and later), a String, an Array or a Hash, but if you already know what to expect, you can check what the parser actually returned. For instance:

YAML.load(({}).to_yaml).class
=> Hash
YAML.load(({}).to_yaml).instance_of?(Hash)
=> true

So, you could look for a Hash:

parsed_yaml = YAML.load(yaml)
!!yaml[/^---/m] && parsed_yaml.instance_of?(Hash)

Replace Hash with whatever type you think you should get.
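Putting the parse and the type check together might look like the sketch below. The method name valid_yaml_of_type? is hypothetical, and the sketch assumes a Ruby recent enough to have YAML.safe_load and Psych::Exception (neither exists in 1.9.3). It drops the "---" test, since valid documents may omit the marker, and it checks the class of the result instead of its truthiness, because valid YAML such as "false" parses to a falsy value.

```ruby
require 'yaml'

# Hypothetical helper: true only if `text` parses as YAML and the top-level
# value is an instance of `expected` (Hash by default).
def valid_yaml_of_type?(text, expected = Hash)
  parsed = YAML.safe_load(text)
  parsed.instance_of?(expected)
rescue Psych::Exception
  # Covers syntax errors as well as classes that safe_load refuses to build.
  false
end

puts valid_yaml_of_type?("name: example\n")    # => true  (a mapping)
puts valid_yaml_of_type?("a: b: c")            # => false (syntax error)
puts valid_yaml_of_type?("just some text")     # => false (parses to a String)
puts valid_yaml_of_type?("- 1\n- 2\n", Array)  # => true  (a sequence)
```

Rescuing Psych::Exception rather than Exception keeps unrelated failures, such as interrupts or out-of-memory errors, from being silently reported as "invalid YAML".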

There might be even better ways to sniff it out, but those are what I'd try first.

Upvotes: 8
