Ludek Vodicka

Reputation: 1730

yaml-cpp parsing fails when the space after a colon is missing

I have encountered a problem in the yaml-cpp parser. When I try to load the following definition:

DsUniversity:
  university_typ: {type: enum, values:[Fachhochschule, Universitat, Berufsakademie]}
  students_at_university: {type: string(50)}

I get the following error:

Error: yaml-cpp: error at line 2, column 39: end of map flow not found

I tried to verify the YAML's validity on http://yaml-online-parser.appspot.com/ and http://yamllint.com/, and both services report the YAML as valid.

The problem is caused by the missing space after the "values:" key. When the YAML is changed to the following format:

DsUniversity:
  university_typ: {type: enum, values: [Fachhochschule, Universitat, Berufsakademie]}
  students_at_university: {type: string(50)}

everything works as expected.

Is there any way to configure/update/fix the yaml-cpp parser so that it also processes YAML with a missing space after the colon?

Added: It seems that the problem is caused by the requirement for a whitespace character as a separator. When I simplified the test snippet to

DsUniversity:[Fachhochschule, Universitat, Berufsakademie]

the yaml-cpp parser reads it as one scalar value "DsUniversity:[Fachhochschule, Universitat, Berufsakademie]". When a space is added after the colon, yaml-cpp correctly loads an element containing a sequence.

Upvotes: 3

Views: 7314

Answers (2)

Revin

Reputation: 115

I think it would be beneficial to parse scalars/keys differently immediately inside a flow map {. If you agree, please vote here:

https://github.com/yaml/yaml-spec/issues/267

Upvotes: 0

Jesse Beder

Reputation: 34044

yaml-cpp is correct here, and those online validators are incorrect. From the YAML 1.2 spec:

7.4.2. Flow Mappings

Normally, YAML insists the “:” mapping value indicator be separated from the value by white space. A benefit of this restriction is that the “:” character can be used inside plain scalars, as long as it is not followed by white space. This allows for unquoted URLs and timestamps. It is also a potential source for confusion as “a:1” is a plain scalar and not a key: value pair.

...

To ensure JSON compatibility, if a key inside a flow mapping is JSON-like, YAML allows the following value to be specified adjacent to the “:”. This causes no ambiguity, as all JSON-like keys are surrounded by indicators. However, as this greatly reduces readability, YAML processors should separate the value from the “:” on output, even in this case.

In your example, you're in a flow mapping (meaning a map surrounded by {}), but your key is not JSON-like: you just have a plain scalar (values is unquoted). To be JSON-like, the key needs to be either single- or double-quoted, or it can be a nested flow sequence or map itself.

In your simplified example,

DsUniversity:[Fachhochschule, Universitat, Berufsakademie]

both yaml-cpp and the online validators parse this correctly as a single scalar. For it to be a map, as you intend, you need a space after the :.
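If you can't change whatever produces the YAML, one possible workaround is to normalize the text before handing it to the parser. The following is only a rough sketch under assumptions: `add_space_after_colon` is a hypothetical helper (it is not part of yaml-cpp), it only handles a `:` directly followed by `[` or `{`, and its quote tracking is naive (an apostrophe inside a plain scalar would confuse it). Note that it also changes the meaning of a block-level plain scalar such as `a:[b]`, so only use something like this if you know your input never relies on that.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not part of yaml-cpp): inserts a space after any ':'
// that is immediately followed by '[' or '{' outside of quoted scalars.
std::string add_space_after_colon(std::string_view in) {
    std::string out;
    out.reserve(in.size() + 8);
    char quote = '\0';  // '\0' while outside quotes, otherwise the opening quote
    for (std::size_t i = 0; i < in.size(); ++i) {
        const char c = in[i];
        out.push_back(c);
        if (quote != '\0') {
            if (quote == '"' && c == '\\' && i + 1 < in.size()) {
                out.push_back(in[++i]);  // copy the escaped character verbatim
            } else if (c == quote) {
                quote = '\0';  // closing quote ('' in single quotes reopens at once)
            }
            continue;
        }
        if (c == '"' || c == '\'') {
            quote = c;
        } else if (c == ':' && i + 1 < in.size() &&
                   (in[i + 1] == '[' || in[i + 1] == '{')) {
            out.push_back(' ');  // add the separator the parser expects
        }
    }
    return out;
}
```

The normalized string could then be passed to the parser instead of the raw file contents. URLs are left alone because the character after their colon is `/`, not `[` or `{`.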

Why does YAML require that space?

In the simple plain scalar case:

a:b

could be ambiguous: it could be read as either a scalar a:b, or a map {a: b}. YAML chooses to read this as a scalar so that URLs can be easily embedded in YAML without quoting:

http://stackoverflow.com

is a scalar (like you'd expect), not a map {http: //stackoverflow.com}!

In a flow context, there's one case where this isn't ambiguous: when the key is quoted, e.g.:

{"a":b}

This is called JSON-like because it's similar to JSON, which requires quotes around all strings, including keys. In this case, YAML knows that the key ends at the end-quote, and so it can be sure that the value starts immediately.

This behavior is explicitly allowed because JSON itself allows things like

{"a":"b"}

Since YAML 1.2 is a strict superset of JSON, this must be legal in YAML.
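To make the two rules concrete, here is a simplified model of the decision in C++. It is a sketch for illustration only, not how yaml-cpp is implemented: `colon_starts_value` is a made-up name, and it assumes that a quote or closing bracket directly before the `:` really does end a JSON-like key. It also ignores white space between a JSON-like key and the `:`.

```cpp
#include <cstddef>
#include <string_view>

// Simplified model of the rule described above (not yaml-cpp's implementation).
// Returns true if the ':' at text[pos] would act as a mapping value indicator.
bool colon_starts_value(std::string_view text, std::size_t pos, bool in_flow) {
    if (pos >= text.size() || text[pos] != ':') return false;

    // Rule 1: ':' followed by white space (or end of input) separates key and value.
    if (pos + 1 == text.size()) return true;
    const char next = text[pos + 1];
    if (next == ' ' || next == '\t' || next == '\n') return true;

    // Rule 2: inside a flow mapping, a JSON-like key (quoted scalar, or nested
    // flow sequence/map) ends at its closing indicator, so the value may follow
    // the ':' directly.
    if (in_flow && pos > 0) {
        const char prev = text[pos - 1];
        return prev == '"' || prev == '\'' || prev == ']' || prev == '}';
    }
    return false;
}
```

With this model, the colon in `a:b` and the one in `http://stackoverflow.com` are not indicators, the colon in `{"a":b}` is, and the colon in `{values:[x]}` is not, which matches the error in the original question.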

Upvotes: 5
