Just-Insane

Reputation: 3

How to parse nested dictionary inside a list in yaml?

I am parsing a YAML file to search for values at any key. Currently I can parse any dict at the first level, but cannot parse nested dictionaries.

I have tried modifying the example at https://stackoverflow.com/a/55608627 in order to parse the dictionary inside of the list, however this results in an error:

AttributeError: 'CommentedSeq' object has no attribute 'items'

When looking at the canonical output from http://yaml-online-parser.appspot.com/ it shows there is a map and a sequence, which I have been unable to account for.

The unmodified parsing function does not output any errors, however it does not see anything inside of the list.

The modified parsing function returns the AttributeError above.

Example YAML file: https://pastebin.com/BhwyPa7V

Full project: https://github.com/Just-Insane/helm-vault/blob/master/vault.py

Parsing function (unmodified):

def dict_walker(node, pattern, path=None):
    path = path if path is not None else ""
    for key, value in node.items():
        if isinstance(value, dict):
            dict_walker(value, pattern=pattern, path=f"{path}/{key}")
        elif value == pattern:
            if action == "enc":
                node[key] = input(f"Input a value for {path}/{key}: ")
                vault_write(node[key], path, key)
            elif (action == "dec") or (action == "view") or (action == "edit"):
                value = vault_read(path, key)
                node[key] = value

Parsing function (modified):

def dict_walker(node, pattern, path=None):
    path = path if path is not None else ""
    for key, value in node.items():
        if isinstance(value, dict):
            dict_walker(value, pattern=pattern, path=f"{path}/{key}")
        elif isinstance(value, list):
            for item in value:
                for value in dict_walker(value, pattern=pattern, path=f"{path}/{key}"):
                    if value == pattern:
                        if action == "enc":
                            node[key] = input(f"Input a value for {path}/{key}: ")
                            vault_write(node[key], path, key)
                        elif (action == "dec") or (action == "view") or (action == "edit"):
                            value = vault_read(path, key)
                            node[key] = value
        elif value == pattern:
            if action == "enc":
                node[key] = input(f"Input a value for {path}/{key}: ")
                vault_write(node[key], path, key)
            elif (action == "dec") or (action == "view") or (action == "edit"):
                value = vault_read(path, key)
                node[key] = value

Expected Results:

The nested dictionary is parsed and values inside are able to be modified successfully.

Actual Results:

  1. Using the unmodified code, values inside the list are not seen at all.

  2. Using the modified code, there is an attribute error caused by CommentedSeq. It is unclear why it is not being parsed as a list.

Upvotes: 0

Views: 3229

Answers (1)

Anthon

Reputation: 76598

As I indicated in the answer you linked to, parsing is completely done even before the yaml.load() method returns. What you are doing is traversing the loaded data.

Your dict_walker() is based on the find() function, which only works for the rather uninteresting YAML input from the question I answered. It assumes that the YAML:

only consists of (plain) scalars that are strings, mappings, and mapping keys that are scalars.

The lookup function presented there can handle sequences, so that is what you need to base your dict_walker() function on. As you have it, that function is aptly named: it can only walk over dictionaries, because invoking the .items() method on node assumes that node is a dict.
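The point about .items() can be reproduced with plain Python containers. The following is only a minimal sketch: items_only_walker is a hypothetical stand-in for dict_walker(), and the data is a made-up fragment using a built-in dict and list instead of the ruamel.yaml types. Since CommentedSeq is a list subclass, it fails in the same way.

```python
def items_only_walker(node):
    """Hypothetical stand-in for dict_walker(): assumes every node is a dict."""
    found = []
    for key, value in node.items():
        found.append(key)
    return found


# Made-up fragment: a mapping whose value is a sequence of mappings.
data = {"providers": [{"name": "prod-cloudflare"}]}

# Works on the top-level mapping.
top_keys = items_only_walker(data)

# Fails when handed the list itself, because a list has no .items().
try:
    items_only_walker(data["providers"])
    error = None
except AttributeError as exc:
    error = str(exc)

# Works again when each element of the list is passed separately,
# because each element is a dict.
nested_keys = [items_only_walker(item) for item in data["providers"]]

print(top_keys, error, nested_keys)
```

In the modified dict_walker() from the question, the inner loop passes value (the whole list) rather than item to the recursive call, which is the situation in the try block above.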

Assuming your example YAML is in the file input.yaml, the following walks the complete tree and reaches dictionaries nested within lists (created from mappings nested within sequences in your YAML):

import sys
from pathlib import Path
import ruamel.yaml

in_file = Path('input.yaml')

action = "view"

def vault_read(path, key):
    # dummy function to show functionality
    vault = {
        ("/spec/acme", "email"): "repl_0",
        ("/spec/acme/dns01/providers/0/cloudflare", "email"): "repl_1",
        ("/spec/acme/dns01/providers/0/cloudflare/apiKeySecretRef", "key"): 42,
    }
    return vault.get((path, key), "not found")

def walk_data(node, pattern, path=None):
    if path is None:
        path = ""
    if isinstance(node, dict):
        for key, value in node.items():
            if value == pattern:
                if action == "enc":
                    node[key] = input(f"Input a value for {path}/{key}: ")
                    vault_write(node[key], path, key)
                elif (action == "dec") or (action == "view") or (action == "edit"):
                    node[key] = vault_read(path, key)
            else:
                walk_data(value, pattern, path=f"{path}/{key}")
    elif isinstance(node, list):
        for idx, item in enumerate(node):
            walk_data(item, pattern, path=f"{path}/{idx}")


yaml = ruamel.yaml.YAML()
data = yaml.load(in_file)

walk_data(data, "changeme")

yaml.dump(data, sys.stdout)

which gives:

apiVersion: certmanager.k8s.io/v1alpha1
kind: ClusterIssuer
metadata:
  name: letsencrypt-production
spec:
  acme:
    # You must replace this email address with your own.
    # Let's Encrypt will use this to contact you about expiring
    # certificates, and issues related to your account.
    email: repl_0
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      # Secret resource used to store the account's private key.
      name: letsencrypt-production
    # Enable the HTTP01 challenge mechanism for this Issuer
    dns01:
      providers:
      - name: prod-cloudflare
        cloudflare:
          email: repl_1
          apiKeySecretRef:
            name: cloudflare-api-key-secret
            key: 42
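To check which (path, key) pairs a walk like this would look up before touching the vault, the same traversal can be written as a read-only generator. This is only a sketch under assumptions: find_paths is a hypothetical name, and the data is an abbreviated plain-dict approximation of the structure shown above, with "changeme" assumed at the three places that were replaced.

```python
def find_paths(node, pattern, path=""):
    """Yield (path, key) for every mapping value equal to pattern."""
    if isinstance(node, dict):
        for key, value in node.items():
            if value == pattern:
                yield path, key
            else:
                yield from find_paths(value, pattern, f"{path}/{key}")
    elif isinstance(node, list):
        # List elements have no key, so the index becomes the path segment.
        for idx, item in enumerate(node):
            yield from find_paths(item, pattern, f"{path}/{idx}")


# Abbreviated, assumed approximation of the example document.
data = {
    "spec": {
        "acme": {
            "email": "changeme",
            "dns01": {
                "providers": [
                    {
                        "name": "prod-cloudflare",
                        "cloudflare": {
                            "email": "changeme",
                            "apiKeySecretRef": {
                                "name": "cloudflare-api-key-secret",
                                "key": "changeme",
                            },
                        },
                    }
                ]
            },
        }
    }
}

matches = list(find_paths(data, "changeme"))
for match_path, match_key in matches:
    print(match_path, match_key)
```

The pairs it yields have the same shape as the keys of the dummy vault dictionary in vault_read(), including the list index 0 as a path segment.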

Upvotes: 1
