Reputation: 179
I have a YAML file that contains a list of mappings from ticker to feature category.
The following is the current content, which includes the mapping of BANKNIFTY_O_C_0_10_W:
index: [ BANKNIFTY_O_C_0_09_W: books,BANKNIFTY_O_C_0_09_W: trends,BANKNIFTY_O_C_0_09_W: trades,BANKNIFTY_O_C_0_09_W: relations,BANKNIFTY_O_P_0_09_W: books,BANKNIFTY_O_P_0_09_W: trends,BANKNIFTY_O_P_0_09_W: trades,BANKNIFTY_O_P_0_09_W: negrelations,BANKNIFTY_O_C_0_10_W: books,BANKNIFTY_O_C_0_10_W: trends,BANKNIFTY_O_C_0_10_W: trades,BANKNIFTY_O_C_0_10_W: relations,BANKNIFTY_O_C_0_10_W: options_banknifty_weekly,BANKNIFTY_O_P_0_10_W: books,BANKNIFTY_O_P_0_10_W: trends,BANKNIFTY_O_P_0_10_W: trades,BANKNIFTY_O_P_0_10_W: negrelations,BANKNIFTY_F_0: books,BANKNIFTY_F_0: trends,BANKNIFTY_F_0: trades,BANKNIFTY_F_0: relations,NIFTY_F_0: books,NIFTY_F_0: trends,NIFTY_F_0: trades,NIFTY_F_0: relations ]
I need the following output:
index:
- BANKNIFTY_O_C_0_09_W: [books, trends, trades, relations]
- BANKNIFTY_O_P_0_09_W: [books, trends, trades, negrelations]
- BANKNIFTY_O_C_0_10_W: [books, trends, trades, relations, options_banknifty_weekly]
- BANKNIFTY_O_P_0_10_W: [books, trends, trades, negrelations]
- BANKNIFTY_F_0: [books, trends, trades, relations]
- NIFTY_F_0: [books, trends, trades, relations]
Upvotes: 0
Views: 976
Reputation: 76578
Your input is a single-item mapping whose value is a list of single-item mappings.
Your output is a list of single-item mappings. That list is ordered in the same way the keys of the original mappings first appear. This indicates that gathering that information should be done using a list or an OrderedDict.
The value of each of those mappings is a list of the original values for that key, also in the order they appear, but these at least partly repeat in the original and not in the target. Since the order needs preserving, a set (which would automatically filter duplicates) cannot be used. A list could be used instead, but that requires checking whether an item is already in the list. In the following I use another OrderedDict, abused as an "OrderedSet" by not looking at the values.
The input is assumed to be in the file input.yaml:
import sys
import pathlib
from collections import OrderedDict
import ruamel.yaml

yaml_file = pathlib.Path('input.yaml')
yaml = ruamel.yaml.YAML()
yaml.default_flow_style = None  # block style, with leaf collections in flow style
data = yaml.load(yaml_file)

# key -> OrderedDict used as an ordered set of that key's values
indexed = OrderedDict()
for elem in data['index']:
    for k in elem:  # just one each
        single_item_map = indexed.setdefault(k, OrderedDict())
        single_item_map[elem[k]] = None  # arbitrary value, unused

# rebuild the list of single-item mappings, one per key, in first-seen order
result = []
for elem in indexed:
    result.append({elem: [k for k in indexed[elem]]})
data['index'] = result
yaml.dump(data, sys.stdout)
which gives:
index:
- BANKNIFTY_O_C_0_09_W: [books, trends, trades, relations]
- BANKNIFTY_O_P_0_09_W: [books, trends, trades, negrelations]
- BANKNIFTY_O_C_0_10_W: [books, trends, trades, relations, options_banknifty_weekly]
- BANKNIFTY_O_P_0_10_W: [books, trends, trades, negrelations]
- BANKNIFTY_F_0: [books, trends, trades, relations]
- NIFTY_F_0: [books, trends, trades, relations]
Setting yaml.default_flow_style = None is necessary because by default a YAML() instance uses block style, whereas your output has flow style on the leaf nodes. More fine-tuned control is possible in ruamel.yaml by not making "normal" dicts and lists but subclassing the objects internally used for keeping round-trip information. In your case this is not necessary, as you want one of the three modes controlled by .default_flow_style (False: all block style, True: all flow style, None: block style with leaves in flow style).
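As a side note on the ordering discussion: on Python 3.7 and later a plain dict also preserves insertion order, so the grouping step can probably be sketched without OrderedDict. The following is only a minimal illustration of that idea, not part of the program above. The function name group_pairs and the shortened sample list are hypothetical, and the data is built in memory instead of being loaded from input.yaml.

```python
def group_pairs(pairs):
    """Group single-item mappings by key.

    Keys keep the order in which they first appear; repeated values
    for a key are dropped while the first-seen order is kept.
    """
    grouped = {}  # key -> dict used as an ordered set (values unused)
    for elem in pairs:
        for key, value in elem.items():  # just one item each
            grouped.setdefault(key, {})[value] = None
    return [{key: list(values)} for key, values in grouped.items()]


# hypothetical, shortened sample in the shape of data['index']
sample = [
    {'BANKNIFTY_F_0': 'books'},
    {'BANKNIFTY_F_0': 'trends'},
    {'NIFTY_F_0': 'books'},
    {'BANKNIFTY_F_0': 'books'},  # repeated value, filtered out
    {'NIFTY_F_0': 'relations'},
]
result = group_pairs(sample)
print(result)
```

The returned list could be assigned to data['index'] in the same way as in the program above. On older Python versions, where dict order is not guaranteed, OrderedDict remains the safer choice.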
Upvotes: 1