sort a YAML block mapping sequence in python

I am trying to write the keys of a YAML block mapping in the order I want. I would like to get something like

depth: !!opencv-matrix
    rows: 480
    cols: 640
    dt: f
    data: 'x'

but every time I dump it, the output changes to

cols: 640
    data: 'x'
    depth: !!opencv-matrix
    dt: f
    rows: 480

I tried a simple way to do it, using a list of single-key dicts:

ordering = ['ymlFile','depth', 'rows', 'cols', 'dt', 'data']
ordered_set = [{'depth': '!!opencv-matrix'}, {'rows' : depthSizeImg[0]}, {'cols' : depthSizeImg[1]}, {'dt' : type(img_d[0][0])}, {'data': ymlList.tolist()}]

with open(out_d, 'a') as f:
    f.write('%YAML:1.0 \n')
    f.write(yaml.dump(ordered_set, default_flow_style=None, allow_unicode=False, indent = 4))

But that produced a flat sequence instead of a nested mapping:

%YAML:1.0 
- {depth: '!!opencv-matrix'}
- {rows: 323}
- {cols: 110}
- {dt: !!python/name:numpy.float32 ''}
- {data: 'x'}

How can I get the correct output?

Upvotes: 1

Views: 1269

Answers (1)

Anthon

Reputation: 76802

In your example

ordered_set = [{'depth': '!!opencv-matrix'}, {'rows' : depthSizeImg[0]}, {'cols' : depthSizeImg[1]}, {'dt' : type(img_d[0][0])}, {'data': ymlList.tolist()}]

You are dumping a list of dicts, and that is exactly what you get as YAML output. Calling a list ordered_set doesn't make it a set, and putting YAML tags (those !!object_name entries) into your data as plain strings doesn't change that either: they are dumped as quoted strings, not as tags.
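To attach a tag such as !!opencv-matrix to a nested mapping with PyYAML, one option is a custom representer. This is only a minimal sketch: OpenCVMatrix, MatrixDumper and represent_opencv_matrix are hypothetical names introduced for illustration, not part of PyYAML or of the question's code. It hands represent_mapping a list of key/value pairs, which PyYAML writes in the order given instead of sorting them.

```python
import yaml


class OpenCVMatrix:
    """Hypothetical container for the fields shown in the question."""

    def __init__(self, rows, cols, dt, data):
        self.rows = rows
        self.cols = cols
        self.dt = dt
        self.data = data


def represent_opencv_matrix(dumper, matrix):
    # A list of pairs (rather than a dict) is emitted in the order given.
    pairs = [
        ("rows", matrix.rows),
        ("cols", matrix.cols),
        ("dt", matrix.dt),
        ("data", matrix.data),
    ]
    # "!!opencv-matrix" is shorthand for this full tag.
    return dumper.represent_mapping("tag:yaml.org,2002:opencv-matrix", pairs)


class MatrixDumper(yaml.SafeDumper):
    """Dumper subclass so the representer is not registered globally."""


MatrixDumper.add_representer(OpenCVMatrix, represent_opencv_matrix)

document = {"depth": OpenCVMatrix(480, 640, "f", "x")}
text = yaml.dump(document, Dumper=MatrixDumper, default_flow_style=False, indent=4)
print(text)
```

The printed text should be a nested block mapping under depth, tagged !!opencv-matrix, with rows, cols, dt and data in that order. Whether OpenCV accepts the result also depends on the %YAML:1.0 header, which this sketch does not write.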

The YAML specification defines !!omap (example 2.26), which combines the ordered structure of a sequence with single-key mappings as elements:

depth: !!omap
  - rows: 480
  - cols: 640
  - dt: f
  - data: x

If you read that into PyYAML, you get:

{'depth': [('rows', 480), ('cols', 640), ('dt', 'f'), ('data', 'x')]}

which means you cannot get the value of rows by simple keyword lookup. If you dump the above to YAML, you get the even uglier:

depth:
- !!python/tuple [rows, 480]
- !!python/tuple [cols, 640]
- !!python/tuple [dt, f]
- !!python/tuple [data, x]

and you cannot get around that with PyYAML without defining a mapping from !!omap to an ordered-dict implementation and vice versa.
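Such a mapping between !!omap and an ordered dict could look roughly like the following in PyYAML. Treat it as a sketch under assumptions, not a complete implementation: OmapLoader, OmapDumper, construct_omap and represent_omap are names made up for this example, collections.OrderedDict is picked as the ordered-dict type, and malformed !!omap input (for instance an item with more than one key) is not handled gracefully.

```python
from collections import OrderedDict

import yaml

OMAP_TAG = "tag:yaml.org,2002:omap"


class OmapLoader(yaml.SafeLoader):
    """Loader subclass that builds an OrderedDict for !!omap."""


class OmapDumper(yaml.SafeDumper):
    """Dumper subclass that writes an OrderedDict as !!omap."""


def construct_omap(loader, node):
    # An !!omap node is a sequence whose items are single-key mappings.
    result = OrderedDict()
    for item in node.value:
        (key_node, value_node), = item.value
        key = loader.construct_object(key_node, deep=True)
        result[key] = loader.construct_object(value_node, deep=True)
    return result


def represent_omap(dumper, data):
    # Emit one single-key mapping per entry, in insertion order.
    return dumper.represent_sequence(OMAP_TAG, [{k: v} for k, v in data.items()])


OmapLoader.add_constructor(OMAP_TAG, construct_omap)
OmapDumper.add_representer(OrderedDict, represent_omap)

yaml_str = """\
depth: !!omap
  - rows: 480
  - cols: 640
  - dt: f
  - data: x
"""

loaded = yaml.load(yaml_str, Loader=OmapLoader)
rows = loaded["depth"]["rows"]  # plain keyword lookup works again
dumped = yaml.dump(loaded, Dumper=OmapDumper, default_flow_style=False)
print(dumped)
```

Registering the constructor and representer on subclasses keeps the change local, so other code that uses yaml.SafeLoader or yaml.SafeDumper is unaffected.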

What you need is a more intelligent "Dumper" for your YAML ¹:

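import ruamel.yaml as yaml

yaml_str = """\
depth: !!omap
  - rows: 480
  - cols: 640
  - dt: f
  - data: x
"""

# legacy function-style API (yaml.load / yaml.dump) of ruamel.yaml;
# newer releases offer the same round-tripping via the ruamel.yaml.YAML class
data1 = yaml.load(yaml_str)
# the !!omap is loaded as an ordered mapping, so keyword access works
data1['depth']['data2'] = 'y'
print(yaml.dump(data1, Dumper=yaml.RoundTripDumper))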

which gives:

depth: !!omap
- rows: 480
- cols: 640
- dt: f
- data: x
- data2: y

Or combine that with a smart loader (which doesn't throw away the ordering information existing in the input), and you can leave out the !!omap:

import ruamel.yaml as yaml

yaml_str = """\
depth:
  - rows: 480
  - cols: 640   # my number of columns
  - dt: f
  - data: x
"""

data3 = yaml.load(yaml_str, Loader=yaml.RoundTripLoader)
print(yaml.dump(data3, Dumper=yaml.RoundTripDumper))

which gives:

depth:
- rows: 480
- cols: 640     # my number of columns
- dt: f
- data: x

(including the preserved comment).


¹ This was done using ruamel.yaml, of which I am the author. You should be able to do the example with data1 in PyYAML with some effort; the other example cannot be done without a major enhancement of PyYAML, which is exactly what ruamel.yaml is.

Upvotes: 1
