Diego

Reputation: 964

How to override the base resolver in pyyaml

I have found several comments and a similar question on how to override the resolver, for example this GitHub comment: https://github.com/yaml/pyyaml/issues/376#issuecomment-576821252

@StrayDragon Actually you can change this behavior if you want by overriding the bool tag regex in the base resolver (https://github.com/yaml/pyyaml/blob/master/lib/yaml/resolver.py#L170-L175), then wire up your resolver instead of the default on a new loader instance (which gets passed to yaml.load()).

I understand the theory, but it is not working when I try to implement it.

This is my python code:

import yaml
import re

class CustomLoader(yaml.FullLoader):
    pass

# Override the boolean tag regex
yaml.resolver.BaseResolver.add_implicit_resolver(
    'tag:yaml.org,2002:bool',
    re.compile(
        r'''^(?:yes|Yes|YES|no|No|NO
            |true|True|TRUE|false|False|FALSE
            |off|Off|OFF)$''',
        re.X
    ),
    list('yYnNtTfFoO')
)

# Read the workflow YAML file using the custom loader
with open(file_path, 'r') as file:
    yaml_content = yaml.load(file, Loader=CustomLoader)

I expected this to work, but my original YAML file

name: cloud
on:
  push:
  workflow_dispatch:
concurrency:
  group: "${{ github.ref }}"

still has its on key converted to True when loaded and dumped:

name: cloud-cicd-ghc-test/diego-test-migrate-to-github-ci
True:
  push: null
  workflow_dispatch: null
concurrency:
  group: "${{ github.ref }}"

So I'm not sure how to do this correctly.

Upvotes: 0

Views: 267

Answers (2)

tinita

Reputation: 4346

To solve the original problem, I would suggest trying yamlcore to read YAML 1.2 files. YAML 1.2 differs from YAML 1.1 in much more than the handling of on.

import yaml
from yamlcore import CoreLoader
d = yaml.load("on: something", Loader=CoreLoader)
print(d)
# {'on': 'something'}

Upvotes: 0

Oluwafemi Sule

Reputation: 38982

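yaml.loader.FullLoader subclasses yaml.resolver.Resolver, which comes with the default implicit resolvers for tags already configured.

The BaseResolver.add_implicit_resolver function appends to the list of resolvers for a tag; it never replaces an existing entry. Calling it on BaseResolver itself, as in the question, also has no effect on CustomLoader, because Resolver keeps its own copy of the resolver table.

Our goal is to replace the resolver for tag:yaml.org,2002:bool, not to add another one to the configured list.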
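A quick way to see the append behaviour is to register a second bool resolver on a throwaway loader and inspect the table. This is only a minimal sketch: AppendOnlyLoader and the shortened regex are made up for illustration, and yaml_implicit_resolvers is the class attribute PyYAML uses to store the entries.

```python
import re

import yaml


class AppendOnlyLoader(yaml.FullLoader):
    """Hypothetical loader used only to demonstrate the append behaviour."""


bool_tag = 'tag:yaml.org,2002:bool'

# Register a stricter bool regex WITHOUT removing the default one
AppendOnlyLoader.add_implicit_resolver(
    bool_tag, re.compile(r'^(?:true|false)$'), list('tf'))

# The default bool entry for 't' is still there, next to the new one
tags_for_t = [tag for tag, _ in AppendOnlyLoader.yaml_implicit_resolvers['t']]
print(tags_for_t.count(bool_tag))  # 2

# The default regex is still consulted, so `on` keeps resolving to a bool
loaded = yaml.load('on: 1', Loader=AppendOnlyLoader)
print(loaded)  # {True: 1}
```

Because the old entry survives and still matches, adding a resolver alone cannot stop on from becoming True.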

We can access this list and filter out the configured entries for tag:yaml.org,2002:bool before registering our custom regular expression for it.

import re
from typing import Type

import yaml


class CustomLoader(yaml.FullLoader):
    pass


def modify_implicit_resolver(loader_cls: Type[yaml.resolver.BaseResolver],
                             tag: str, regexp: re.Pattern, first: list[str]):
    """ Replaces the regexp for the provided tag
        in the implicit resolvers configuration of loader_cls.
    """
    # Work on a copy: the inherited table is shared with the stock
    # loaders, and editing it in place would change them as well
    resolvers = {key: entries[:] for key, entries
                 in loader_cls.yaml_implicit_resolvers.items()}
    for key in first:
        # Exclude the existing entries for this tag
        resolvers[key] = [(_tag, _regexp)
                          for _tag, _regexp in resolvers.get(key, [])
                          if _tag != tag]
    loader_cls.yaml_implicit_resolvers = resolvers

    loader_cls.add_implicit_resolver(tag, regexp, first)


# Override the boolean tag regex
modify_implicit_resolver(
    CustomLoader, 'tag:yaml.org,2002:bool',
    re.compile(
        r'''^(?:yes|Yes|YES|no|No|NO
            |true|True|TRUE|false|False|FALSE
            |off|Off|OFF)$''', re.X), list('yYnNtTfFoO'))


def main():
    # Read the workflow YAML file using the custom loader
    file_path = './sample.yaml'
    with open(file_path, 'r') as file:
        yaml_content = yaml.load(file, Loader=CustomLoader)
    print(yaml.dump(yaml_content))

main()

Alternatively, CustomLoader could subclass a loader in the yaml.loader module that is built on BaseResolver, such as yaml.loader.BaseLoader, if we don't need a resolver that has implicit resolvers pre-configured.
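To illustrate the trade-off of that last option, here is a minimal sketch that loads a small document with yaml.BaseLoader directly. The inline document and its retries and enabled keys are made up for illustration. With no implicit resolvers configured, on stays a string, but so does every other scalar, including numbers, booleans and empty values.

```python
import yaml

# Hypothetical workflow-like document, inlined so the example is self-contained
doc = "on:\n  push:\nretries: 3\nenabled: true\n"

# BaseLoader has no implicit resolvers, so every scalar is loaded as a string
data = yaml.load(doc, Loader=yaml.BaseLoader)
print(data)
# {'on': {'push': ''}, 'retries': '3', 'enabled': 'true'}
```

This keeps on intact without touching any regex, at the cost of converting values such as '3' or 'true' yourself afterwards.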

Upvotes: 0
