somenxavier

Reputation: 1557

Knowing if a file is YAML or not

I would expect YAML.load_file(foo) from Ruby's YAML module to return nil if foo is not a YAML file. Instead I get an exception:

did not find expected alphabetic or numeric character while scanning an alias at line 3 column 3 (Psych::SyntaxError)
    from /usr/lib/ruby/2.4.0/psych.rb:377:in `parse_stream'
    from /usr/lib/ruby/2.4.0/psych.rb:325:in `parse'
    from /usr/lib/ruby/2.4.0/psych.rb:252:in `load'
    from /usr/lib/ruby/2.4.0/psych.rb:473:in `block in load_file'
    from /usr/lib/ruby/2.4.0/psych.rb:472:in `open'
    from /usr/lib/ruby/2.4.0/psych.rb:472:in `load_file'
    from ./select.rb:27:in `block in selecting'
    from ./select.rb:26:in `each'
    from ./select.rb:26:in `selecting'
    from ./select.rb:47:in `block (2 levels) in <main>'
    from ./select.rb:46:in `each'
    from ./select.rb:46:in `block in <main>'
    from ./select.rb:44:in `each'
    from ./select.rb:44:in `<main>'

How can I determine whether a file is a YAML file without raising an exception? In my case, I walk a directory and process Markdown files: I add the Markdown files with a key output: word to a list and return that list.

mylist = []
for d in (directory - excludinglist)
  begin
    info = YAML.load_file(d)
    if info
      if info.has_key?('output')
        if info['output'].has_key?(word)
          mylist.push(d)
        end
      end
    end
  rescue Psych::SyntaxError => error
    return []
  end
end
return mylist

When I catch the exception, the loop does not continue pushing elements onto my list.

Upvotes: 0

Views: 329

Answers (2)

eiko

Reputation: 5345

The short answer: you can't.

Because a YAML file is just a text file, the only way to know whether a given text file is YAML or not is to parse it. The parser will try to parse the file, and if it is not valid YAML, it will raise an error.

Errors and exceptions are a common part of Ruby, especially in the world of IO. There's no reason to be afraid of them. You can easily rescue from them and continue on your way:

begin
  yaml = YAML.load_file(foo)
rescue Psych::SyntaxError => e
  # handle the bad YAML here
end
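
If you want the nil-on-failure behaviour from the question, one option is to wrap the call yourself. This is only a sketch: load_yaml_or_nil is a made-up name, not something the YAML module provides, and it only rescues Psych::SyntaxError, so other errors (a missing file, for example) still propagate.

```ruby
require "yaml"

# Hypothetical wrapper (illustrative name, not part of the YAML module).
# Returns the parsed data, or nil when the file is not valid YAML.
def load_yaml_or_nil(path)
  YAML.load_file(path)
rescue Psych::SyntaxError
  nil
end
```

Callers can then test the result with a plain `if`, the same way the question's code already does with `if info`.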

You mentioned that the following code will not work because you need to handle multiple files in a directory:

def foo
  mylist = []
  for d in (directory - excludinglist)
    begin
      info = YAML.load_file(d)
      if info
        if info.has_key?('output')
          if info['output'].has_key?(word)
            mylist.push(d)
          end
        end
      end
    rescue Psych::SyntaxError => error
      return []
    end
  end
  return mylist
end

The only issue here is that when you hit an error, you respond by returning from the function early. If you don't return, the for-loop will continue and you will get your desired functionality:

def foo
  mylist = []
  for d in (directory - excludinglist)
    begin
      info = YAML.load_file(d)
      if info
        if info.has_key?('output')
          if info['output'].has_key?(word)
            mylist.push(d)
          end 
        end
      end 
    rescue Psych::SyntaxError => error
      # do nothing!
      # puts "or you could display an error message!"
    end
  end
  return mylist
end
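
The same loop can also be written with select, which might read a little more directly. This is a sketch under a few assumptions: files_with_output is a hypothetical name, and the path list and word are passed in as arguments instead of coming from surrounding variables. It also checks that the parsed value is a Hash, because a plain-text file can parse successfully as a single string, and calling has_key? on a String raises NoMethodError.

```ruby
require "yaml"

# Hypothetical helper: +paths+ and +word+ are explicit arguments here
# (an assumption; the original code reads them from elsewhere).
def files_with_output(paths, word)
  paths.select do |path|
    begin
      info = YAML.load_file(path)
      # A successful parse does not guarantee a mapping, so check the
      # types before asking for keys.
      info.is_a?(Hash) && info["output"].is_a?(Hash) && info["output"].key?(word)
    rescue Psych::SyntaxError
      false # not valid YAML: skip this file and keep going
    end
  end
end
```

The explicit begin/end inside the block keeps this compatible with the Ruby 2.4 shown in the question's backtrace.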

Upvotes: 2

max pleaner

Reputation: 26778

Psych::SyntaxError gets raised by Psych::Parser#parse, the source for which is written in C. So unless you want to work with C, you can't write a patch for the method in Ruby to prevent the exception from getting raised.

Still, you could certainly rescue the exception, like so:

begin
  foo = YAML.load_file("not_yaml.txt")
rescue Psych::SyntaxError => error
  puts "bad yaml"
end
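
If "bad yaml" is too terse, the exception object carries the details that appear in the backtrace message. A minimal sketch, assuming a hypothetical helper named yaml_problem that returns nil for a parseable file and a short description otherwise:

```ruby
require "yaml"

# Hypothetical helper: nil when the file parses, otherwise a one-line
# description of where parsing stopped.
def yaml_problem(path)
  YAML.load_file(path)
  nil
rescue Psych::SyntaxError => e
  # problem, line and column are readers defined on Psych::SyntaxError.
  "#{path}: #{e.problem} (line #{e.line}, column #{e.column})"
end
```

Printing that string for each skipped file makes it easier to see which files were rejected and why.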

Upvotes: 2
