Reputation: 440
I'd like to be able to take the YAML defined below and turn it into a dictionary.
development:
  user:dev_uid
  pass:dev_pwd
  host:127.0.0.1
  database:dev_db
production:
  user:uid
  pass:pwd
  host:127.0.0.2
  database:db
I have been able to use the YAML library to load the data in. However, my dictionary appears to contain the environmental items as a long string.
This code:
#!/usr/bin/python3
import yaml
with open('database.conf', 'r') as f:
    config = yaml.safe_load(f)
print(config['development'])
yields the following output.
user:dev_uid pass:dev_pwd host:127.0.0.1 database:dev_db
I can't access any of the entries by key name, and I can't parse that string by passing it to the YAML loader a second time.
print(config['development']['user'])
This code yields the following error:
TypeError: string indices must be integers
Ideally I would like to end up with a parsing function that returns a dictionary or a list, so I can access the properties by key name or with the dot operator, like:
print(config['development']['user'])
config.user
Where am I going wrong?
Upvotes: 6
Views: 21790
Reputation: 362478
Your "yaml" is not a mapping of mappings, it's a mapping of strings. In YAML 1.2, block mapping entries need whitespace after the separator, e.g.
development:
  user: dev_uid
  pass: dev_pwd
  host: 127.0.0.1
  database: dev_db
production:
  user: uid
  pass: pwd
  host: 127.0.0.2
  database: db
Don't try to pre-process this text. Instead, find who generated the markup and throw the spec at them.
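As a quick check of the corrected file, here is a minimal sketch assuming PyYAML is installed and the nested entries are indented by two spaces (the variable name `fixed` is only for illustration). With a space after each colon, the same loader should return nested dictionaries:

```python
import yaml  # PyYAML

# The corrected document: a space follows every colon.
fixed = """\
development:
  user: dev_uid
  pass: dev_pwd
  host: 127.0.0.1
  database: dev_db
production:
  user: uid
  pass: pwd
  host: 127.0.0.2
  database: db
"""

config = yaml.safe_load(fixed)

# Each environment is now a dict, so lookup by key name works.
print(config['development']['user'])   # dev_uid
print(config['production']['host'])    # 127.0.0.2
```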
Upvotes: 10
Reputation: 76578
Your YAML is absolutely valid, and that is why you don't get an error when loading it. It doesn't load as you expect because YAML lets a plain (unquoted) scalar continue over several lines, folding each line break into a single space. That is what happens to your
  user:dev_uid
  pass:dev_pwd
  host:127.0.0.1
  database:dev_db
Your YAML file is equivalent to:
development: "user:dev_uid pass:dev_pwd host:127.0.0.1 database:dev_db"
production: "user:uid pass:pwd host:127.0.0.2 database:db"
and to
development: user:dev_uid pass:dev_pwd host:127.0.0.1 database:dev_db
production: user:uid pass:pwd host:127.0.0.2 database:db
The quotes are not necessary, because the value for development cannot be mistaken for a mapping: for that, the colon after each key would have to be followed by a space. This can be seen from the older (now outdated) YAML 1.1 specification that was used to implement PyYAML¹.
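The equivalence can be checked directly. This is a small sketch, assuming PyYAML and a two-space indent in the original file; the names `original`, `quoted`, and `unquoted` are only for illustration:

```python
import yaml  # PyYAML

# The file from the question: no space after the inner colons.
original = """\
development:
  user:dev_uid
  pass:dev_pwd
  host:127.0.0.1
  database:dev_db
production:
  user:uid
  pass:pwd
  host:127.0.0.2
  database:db
"""

# The same data with each value written as one double-quoted scalar.
quoted = '''\
development: "user:dev_uid pass:dev_pwd host:127.0.0.1 database:dev_db"
production: "user:uid pass:pwd host:127.0.0.2 database:db"
'''

# And once more with the quotes removed.
unquoted = quoted.replace('"', '')

loaded = yaml.safe_load(original)
assert loaded == yaml.safe_load(quoted) == yaml.safe_load(unquoted)

# Each environment is a single folded string, not a mapping.
print(loaded['development'])
```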
The best approach is to convert the data and write out corrected YAML, which is easy if you can assume that none of the keys and values contain embedded spaces:
import sys
import yaml
yaml_str = """\
development:
  user:dev_uid
  pass:dev_pwd
  host:127.0.0.1
  database:dev_db
production:
  user:uid
  pass:pwd
  host:127.0.0.2
  database:db
"""
data = yaml.safe_load(yaml_str)
for key in data:
    val = data[key]
    # leave values that contain no key:value pair untouched
    if ':' not in val:
        continue
    data[key] = tmp = {}
    # every whitespace-separated item is one key:value pair
    for x in val.split():
        x = x.split(':', 1)
        tmp[x[0]] = x[1]
yaml.safe_dump(data, sys.stdout, default_flow_style=False)
If your file is more complicated than what you presented, you might have to recurse into dict values and list items, which is fairly trivial.
The above outputs:
development:
  database: dev_db
  host: 127.0.0.1
  pass: dev_pwd
  user: dev_uid
production:
  database: db
  host: 127.0.0.2
  pass: pwd
  user: uid
which then loads as you expect without the hassle.
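For the more complicated case, where you have to recurse into dict values and list items, a rough sketch could look like this. The helper name `expand_packed` and the sample data are hypothetical, and the rule "convert a string only if every whitespace-separated token contains a colon" is an assumption that will misfire on values such as URLs or times, so adapt it to your data:

```python
def expand_packed(node):
    """Recursively turn 'k1:v1 k2:v2' strings into dicts (hypothetical helper)."""
    if isinstance(node, dict):
        return {key: expand_packed(value) for key, value in node.items()}
    if isinstance(node, list):
        return [expand_packed(item) for item in node]
    if isinstance(node, str):
        tokens = node.split()
        # Assumption: a string is "packed" only if every token has a colon.
        if tokens and all(':' in token for token in tokens):
            return dict(token.split(':', 1) for token in tokens)
    # Numbers, None, and ordinary strings are returned unchanged.
    return node


# Made-up data in the shape yaml.safe_load would return for such a file.
loaded = {
    'development': 'user:dev_uid pass:dev_pwd host:127.0.0.1 database:dev_db',
    'replicas': ['user:a host:b', 'plain text'],
    'port': 5432,
}
result = expand_packed(loaded)
print(result['development']['user'])   # dev_uid
print(result['replicas'])              # [{'user': 'a', 'host': 'b'}, 'plain text']
```

The result can then be passed to yaml.safe_dump in the same way as above.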
¹The newer YAML 1.2 allows key-value pairs without a space after the colon when using flow-style mappings. But the pre-requisite for that is that both key and value are (double) quoted. This change was necessary to allow YAML 1.2 compatibility with JSON:
development: {
  "user":"dev_uid",
  "pass":"dev_pwd",
  "host":"127.0.0.1",
  "database":"dev_db"
}
Upvotes: -2
Reputation: 1172
Since you are not getting what you want from the yaml module directly, your .conf file is probably using a format different from what the yaml module currently expects.
This code is a quick workaround that gives you the dictionary you want:
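# config is the dictionary returned by the YAML loader in the question
for mainkey in ['production', 'development']:
    d = {}
    for item in config[mainkey].split():
        # split on the first colon only, so values may contain colons
        key, value = item.split(':', 1)
        d[key] = value
    config[mainkey] = d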
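Once the values are dictionaries, access by key name works. For the dot-style access the question also asks about, one possible sketch uses types.SimpleNamespace from the standard library. The helper name `to_namespace` is hypothetical, and the sample dictionary stands in for the converted config. Note that pass is a Python keyword, so that entry is only reachable through getattr:

```python
from types import SimpleNamespace


def to_namespace(node):
    """Wrap nested dicts in SimpleNamespace objects (hypothetical helper)."""
    if isinstance(node, dict):
        return SimpleNamespace(**{k: to_namespace(v) for k, v in node.items()})
    return node


# Stand-in for the dictionary produced by the workaround above.
config = {
    'development': {'user': 'dev_uid', 'pass': 'dev_pwd',
                    'host': '127.0.0.1', 'database': 'dev_db'},
    'production': {'user': 'uid', 'pass': 'pwd',
                   'host': '127.0.0.2', 'database': 'db'},
}

ns = to_namespace(config)
print(ns.development.user)               # dev_uid
# ns.development.pass would be a SyntaxError, so use getattr instead.
print(getattr(ns.development, 'pass'))   # dev_pwd
```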
Upvotes: 2