Reputation: 5455
I am writing a script that needs to parse the YAML output that Puppet produces.
When I make a request against, for example, https://puppet:8140/production/catalog/my.testserver.no, I get back some YAML that looks something like:
--- &id001 !ruby/object:Puppet::Resource::Catalog
  aliases: {}
  applying: false
  classes:
    - s_baseconfig
    ...
  edges:
    - &id111 !ruby/object:Puppet::Relationship
      source: &id047 !ruby/object:Puppet::Resource
        catalog: *id001
        exported:
and so on... The problem is that when I call yaml.load(yamlstream), I get an error like:
yaml.constructor.ConstructorError: could not determine a constructor for the tag '!ruby/object:Puppet::Resource::Catalog'
  in "<string>", line 1, column 5:
    --- &id001 !ruby/object:Puppet::Reso ...
        ^
As far as I know, the &id001 part (an anchor) is supported in YAML.
Is there any way around this? Can I tell the yaml parser to ignore them? I only need a couple of lines from the yaml stream, maybe regex is my friend here? Anyone done any yaml cleanup regexes before?
You can get the YAML output with curl like this:
curl --cert /var/lib/puppet/ssl/certs/$(hostname).pem --key /var/lib/puppet/ssl/private_keys/$(hostname).pem --cacert /var/lib/puppet/ssl/certs/ca.pem -H 'Accept: yaml' https://puppet:8140/production/catalog/$(hostname)
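Since the script itself is in Python, the same request could probably be made without shelling out to curl. This is a minimal sketch using only the standard library. The helper names `build_catalog_request` and `fetch_catalog` are made up for illustration, the paths are the ones from the curl command, and using `socket.gethostname()` in place of `$(hostname)` is an assumption:

```python
import socket
import ssl
import urllib.request

SSL_DIR = "/var/lib/puppet/ssl"  # same directory the curl command uses


def build_catalog_request(node, server="puppet", port=8140, environment="production"):
    """Build the GET request for a node's catalog, asking for YAML."""
    url = "https://%s:%d/%s/catalog/%s" % (server, port, environment, node)
    return urllib.request.Request(url, headers={"Accept": "yaml"})


def fetch_catalog(node=None):
    """Fetch the catalog YAML as text, authenticating with the node's own cert."""
    # Assumption: the Python hostname matches `hostname` and the cert file name.
    node = node or socket.gethostname()
    context = ssl.create_default_context(cafile=SSL_DIR + "/certs/ca.pem")
    context.load_cert_chain(
        certfile="%s/certs/%s.pem" % (SSL_DIR, node),
        keyfile="%s/private_keys/%s.pem" % (SSL_DIR, node),
    )
    request = build_catalog_request(node)
    with urllib.request.urlopen(request, context=context) as response:
        return response.read().decode("utf-8")


# Usage on a Puppet-managed host (needs read access to the private key):
#     yamlstream = fetch_catalog()
```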
I also found some information about this on the Puppet mailing list at http://www.mail-archive.com/[email protected]/msg24143.html. But I can't get it to work correctly...
Upvotes: 14
Views: 4817
Reputation: 19
Simple YAML parser:
import yaml
with open("file", "r") as f:
    for line in f:
        # Keep the pieces between the first and last '?', one per line, and normalise the quotes
        parsed = yaml.safe_load('\n'.join(line.split('?')[1:-1]).replace('?','\n').replace('""','\'').replace('"','\''))
        # print('\n'.join(line.split('?')[1:-1]))
        # print('\n'.join(line.split('?')[1:-1]).replace('?','\n').replace('""','\'').replace('"','\''))
        print(line)
        print(parsed)
Upvotes: -1
Reputation: 1657
I emailed Kirill Simonov, the creator of PyYAML, for help parsing a Puppet YAML file.
He gladly helped with the following code. It is for parsing a Puppet log, but I'm sure you can modify it to parse other Puppet YAML files.
The idea is to register constructors for the Ruby object tags, so that PyYAML can then read the data.
Here goes:
#!/usr/bin/env python
import yaml
def construct_ruby_object(loader, suffix, node):
    # Treat any !ruby/object:<Class> node as a plain mapping
    return loader.construct_yaml_map(node)
def construct_ruby_sym(loader, node):
    # Treat Ruby symbols as plain strings
    return loader.construct_yaml_str(node)
yaml.add_multi_constructor(u"!ruby/object:", construct_ruby_object)
yaml.add_constructor(u"!ruby/sym", construct_ruby_sym)
with open('201203130939.yaml', 'r') as stream:
    # Newer PyYAML releases require an explicit Loader argument
    mydata = yaml.load(stream, Loader=yaml.Loader)
print(mydata)
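A variation on the same idea, sketched for Python 3. It registers the constructors on a `SafeLoader` subclass, so other `yaml` calls in the program are unaffected, and it chooses the constructor by node type in case a tagged node is not a mapping. `PuppetLoader` is a name chosen here, and the sample document is made up from the catalog excerpt in the question:

```python
import yaml


class PuppetLoader(yaml.SafeLoader):
    """SafeLoader subclass so the Ruby-tag handling stays local to this loader."""


def construct_ruby_object(loader, suffix, node):
    # suffix is the Ruby class name (e.g. "Puppet::Resource"); it is ignored here.
    # The map/seq constructors are generators, which lets PyYAML resolve
    # self-referencing aliases such as `catalog: *id001`.
    if isinstance(node, yaml.MappingNode):
        return loader.construct_yaml_map(node)
    if isinstance(node, yaml.SequenceNode):
        return loader.construct_yaml_seq(node)
    return loader.construct_scalar(node)


def construct_ruby_sym(loader, node):
    # Ruby symbols become plain strings
    return loader.construct_scalar(node)


PuppetLoader.add_multi_constructor("!ruby/object:", construct_ruby_object)
PuppetLoader.add_constructor("!ruby/sym", construct_ruby_sym)

# Made-up sample shaped like the catalog excerpt in the question
SAMPLE = """\
--- &id001 !ruby/object:Puppet::Resource::Catalog
aliases: {}
applying: false
classes:
  - s_baseconfig
edges:
  - &id111 !ruby/object:Puppet::Relationship
    source: &id047 !ruby/object:Puppet::Resource
      catalog: *id001
      exported: false
"""

catalog = yaml.load(SAMPLE, Loader=PuppetLoader)
print(catalog["classes"])
```

The Ruby objects come back as ordinary dicts, and the alias inside `edges` points at the same dict object as the top-level catalog.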
Upvotes: 24
Reputation: 5455
I only needed the classes section, so I ended up writing this little Python function to strip it out...
Hope it's useful for someone :)
#!/usr/bin/env python
import re
def getSingleYamlClass(className, yamlList):
    printGroup = False
    groupIndent = 0
    firstInGroup = False
    output = ''
    for line in yamlList:
        # readlines() keeps the trailing newline; drop it so it is not doubled below
        line = line.rstrip('\n')
        # Count how many spaces in the beginning of our line
        spaceCount = len(re.findall(r'^[ ]*', line)[0])
        cleanLine = line.strip()
        if cleanLine == className:
            printGroup = True
            groupIndent = spaceCount
            firstInGroup = True
        if (printGroup and spaceCount > groupIndent) or firstInGroup:
            # Strip away the X amount of spaces for this group, so we get valid yaml
            output += re.sub(r'^[ ]{%s}' % groupIndent, '', line) + '\n'
            firstInGroup = False # Reset this
        else:
            # End of our group, reset
            groupIndent = 0
            printGroup = False
    return output
with open('puppet.yaml') as f:
    print(getSingleYamlClass('classes:', f.readlines()))
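For reference, here is a compact sketch of the same indentation-based idea, run on a made-up sample so the result can be seen without a real catalog. `extract_block` is a hypothetical helper, not part of any library, and the second class name is invented. Unlike the function above, it stops at the first line that ends the block:

```python
def extract_block(key, lines):
    """Return the de-indented lines of the block introduced by `key`."""
    block = []
    indent = None  # indentation of the key line, once it has been found
    for raw in lines:
        line = raw.rstrip("\n")
        spaces = len(line) - len(line.lstrip(" "))
        if indent is None:
            if line.strip() == key:
                indent = spaces
                block.append(line[indent:])
        elif spaces > indent and line.strip():
            block.append(line[indent:])
        else:
            break  # first line at or above the key's indentation ends the block
    return block


# Made-up sample; "s_ssh" is invented for illustration
SAMPLE = """\
--- &id001 !ruby/object:Puppet::Resource::Catalog
  aliases: {}
  classes:
    - s_baseconfig
    - s_ssh
  edges: []
""".splitlines(True)

block = extract_block("classes:", SAMPLE)
# The block is a simple "- item" list, so the names can be read off directly
classes = [item.strip()[2:] for item in block[1:] if item.strip().startswith("- ")]
print(classes)
```

The extracted block no longer contains any Ruby tags, so it could also be handed to a YAML parser instead of being split by hand.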
Upvotes: 1
Reputation: 11
I believe the crux of the matter is the fact that puppet is using yaml "tags" for ruby-fu, and that's confusing the default python loader. In particular, PyYAML has no idea how to construct a ruby/object:Puppet::Resource::Catalog, which makes sense, since that's a ruby object.
Here's a link showing various uses of YAML tags: http://www.yaml.org/spec/1.2/spec.html#id2761292
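A small check that the tag, not the `&id001` anchor, is what the loader rejects. The two documents are made up and differ only in the Ruby tag; `myself` is an invented key used to exercise the alias:

```python
import yaml

TAGGED = """\
--- &id001 !ruby/object:Puppet::Resource::Catalog
aliases: {}
myself: *id001
"""
# Same document with only the Ruby tag removed; anchor and alias stay
UNTAGGED = TAGGED.replace(" !ruby/object:Puppet::Resource::Catalog", "")

try:
    yaml.safe_load(TAGGED)
    tag_error = None
except yaml.constructor.ConstructorError as exc:
    tag_error = exc.problem
print(tag_error)

plain = yaml.safe_load(UNTAGGED)
# The alias resolves to the very same dict as the document root
print(plain["myself"] is plain)
```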
I've gotten past this with a brute-force approach, by simply doing something like:
cat the_yaml | sed 's#\!ruby/object.*$##gm' > cleaner.yaml
but now I'm stuck on an issue where the *resource_table* block is confusing PyYAML with its complex keys (the use of '? ' to indicate the start of a complex key, specifically).
If you find a nice way around that, please let me know... but given how tied at the hip Puppet is to Ruby, it may just be easier to write your script directly in Ruby.
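One possible way around the complex keys, sketched under assumptions: that the failure is PyYAML refusing to use a list as a dict key, and that the keys in resource_table are short sequences such as `[Class, main]` (a guess at their shape, as is the sample below). The regex mirrors the sed command above, and `ComplexKeyLoader` is a name invented here. It overrides `construct_mapping` so that sequence keys become tuples:

```python
import re
import yaml


class ComplexKeyLoader(yaml.SafeLoader):
    """SafeLoader that turns sequence keys into tuples so they can be dict keys."""

    def construct_mapping(self, node, deep=False):
        self.flatten_mapping(node)  # keep support for '<<' merge keys
        mapping = {}
        for key_node, value_node in node.value:
            # deep=True so a sequence key is fully built before it is converted
            key = self.construct_object(key_node, deep=True)
            if isinstance(key, list):
                key = tuple(key)
            mapping[key] = self.construct_object(value_node, deep=deep)
        return mapping


# Made-up sample; the real resource_table may be shaped differently
SAMPLE = """\
--- &id001 !ruby/object:Puppet::Resource::Catalog
resource_table:
  ? - Class
    - main
  : &id002 !ruby/object:Puppet::Resource
    title: main
    catalog: *id001
"""

# Same effect as the sed command: drop everything from "!ruby/object" to end of line
cleaned = re.sub(r"!ruby/object.*$", "", SAMPLE, flags=re.MULTILINE)
catalog = yaml.load(cleaned, Loader=ComplexKeyLoader)
resource = catalog["resource_table"][("Class", "main")]
print(resource["title"])
```

Keys that nest a list inside the sequence would still fail, since a tuple containing a list is not hashable.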
Upvotes: 1