JGG

Reputation: 314

How to merge two yaml files while overriding duplicates using yq4

I have two files. The first one, default_deploy.yaml, is where all the defaults are set:

deploy:
  - env: prod
    disabled: true
    namespaces: 
      - production
    disable_switch:
      a: b
      c: d
      e: f
  - env: dev
    disabled: true
    namespaces: 
      - development
    disable_switch:
      a: b
      c: d
      e: f

I have another file, which is project-specific. Since all the defaults are already in place, if I want to use just the dev env, say, I only add the 'dev' entry to that file. deploy.yaml:

deploy:
  - env: dev
    disabled: false
    namespaces: 
      - myns
    disable_switch:
      a: y

If I load the file deploy.yaml into default_deploy.yaml I get the following:

$ yq eval '. *d load("deploy.yaml")' default_deploy.yaml
deploy:
  - env: dev
    disabled: false
    namespaces:
      - myns
    disable_switch:
      a: y
      c: d
      e: f
  - env: dev
    disabled: true
    namespaces:
      - development
    disable_switch:
      a: b
      c: d
      e: f

I would like to actually get:

deploy:
  - env: prod
    disabled: true
    namespaces:
      - production
    disable_switch:
      a: b
      c: d
      e: f
  - env: dev
    disabled: false
    namespaces:
      - myns
    disable_switch:
      a: y
      c: d
      e: f

I wonder if the problem is actually the use of lists. If I change the format to dict, it works like a charm, here's the new output:

deploy:
  prod:
    disabled: true
    namespaces:
      - production
    disable_switch:
      a: b
      c: d
      e: f
  dev:
    disabled: false
    namespaces:
      - myns
    disable_switch:
      a: y
      c: d
      e: f

Any clues? thx!!

Upvotes: 0

Views: 49

Answers (1)

pmf

Reputation: 36033

"I wonder if the problem is actually the use of lists. If I change the format to dict, it works like a charm"

Yes, because when merging arrays, yq can either append them to each other (using *+) or merge them by index, which is why your dev entry clashed with the first item (prod) of the defaults. yq by itself cannot determine that the .env subkey is the one to match on; this has to happen programmatically (see the manual on how to Merge arrays of objects together, matching on a key).
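
To make the index-based behaviour concrete, here is a minimal Python sketch. It only imitates the result shown in the question and is not yq's actual implementation; the helper name merge_by_index is made up for illustration, and the data is taken from the outputs posted in the question.

```python
# Sketch only: imitates an index-based deep merge, NOT yq's real code.
def merge_by_index(base, override):
    # Maps are merged key by key.
    if isinstance(base, dict) and isinstance(override, dict):
        merged = dict(base)
        for key, value in override.items():
            merged[key] = merge_by_index(base[key], value) if key in base else value
        return merged
    # Lists are merged position by position; nothing looks at .env here.
    if isinstance(base, list) and isinstance(override, list):
        merged = [merge_by_index(b, o) for b, o in zip(base, override)]
        longer = base if len(base) > len(override) else override
        return merged + longer[len(merged):]
    # Scalars (and mismatched types): the override wins.
    return override

defaults = {"deploy": [
    {"env": "prod", "disabled": True, "namespaces": ["production"],
     "disable_switch": {"a": "b", "c": "d", "e": "f"}},
    {"env": "dev", "disabled": True, "namespaces": ["development"],
     "disable_switch": {"a": "b", "c": "d", "e": "f"}},
]}
project = {"deploy": [
    {"env": "dev", "disabled": False, "namespaces": ["myns"],
     "disable_switch": {"a": "y"}},
]}

result = merge_by_index(defaults, project)
print([item["env"] for item in result["deploy"]])  # ['dev', 'dev']
```

The project's single item sits at index 0, so it is merged into the prod defaults, and the dev defaults at index 1 are left untouched.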

One option, which you've also found, would be to convert both arrays into maps, perform the merge, and then reconstruct the array from the result (if necessary).
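
The idea of going from array to map and back can be sketched in Python as follows. This is an illustration of the logic under stated assumptions rather than a yq recipe: the helper names deep_merge and merge_by_key are hypothetical, nested lists are simply replaced by the override, and every item is assumed to carry an env key.

```python
# Sketch only: key-based merge via an intermediate map, not a yq feature.
def deep_merge(base, override):
    """Recursively merge dicts; for anything else the override wins."""
    if isinstance(base, dict) and isinstance(override, dict):
        merged = dict(base)
        for key, value in override.items():
            merged[key] = deep_merge(base[key], value) if key in base else value
        return merged
    return override

def merge_by_key(base_items, override_items, key="env"):
    # Array -> map, keyed on the reference field (insertion order is kept).
    by_key = {item[key]: item for item in base_items}
    # Merge each override into its matching entry, or add it if unmatched.
    for item in override_items:
        by_key[item[key]] = deep_merge(by_key.get(item[key], {}), item)
    # Map -> array again.
    return list(by_key.values())

defaults = [
    {"env": "prod", "disabled": True, "namespaces": ["production"],
     "disable_switch": {"a": "b", "c": "d", "e": "f"}},
    {"env": "dev", "disabled": True, "namespaces": ["development"],
     "disable_switch": {"a": "b", "c": "d", "e": "f"}},
]
project = [
    {"env": "dev", "disabled": False, "namespaces": ["myns"],
     "disable_switch": {"a": "y"}},
]

merged = merge_by_key(defaults, project)
print([(item["env"], item["disabled"]) for item in merged])
```

With the data from the question, prod keeps its defaults and only the dev entry picks up the project-specific values.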

Another way would be to iterate over the items, and perform a "local" merge on matching items. Here's an approach using ireduce to iterate, and select to filter for matching items:

yq '.deploy[] as $i ireduce (
  load("default_deploy.yaml"); (.deploy[] | select(.env == $i.env)) *=d $i
)' deploy.yaml

Yet another one would be to first perform a merge that appends the arrays (or simply add the two arrays if there's nothing besides the .deploy keys to merge), then group_by the reference key, and perform the inner merge on the groups found:

yq '. *+ load("deploy.yaml") | .deploy |= [group_by(.env)[] | .[0] *d .[1]]' \
  default_deploy.yaml

or

yq ea '[.deploy[]] | [group_by(.env)[] | .[0] *d .[1]] | {"deploy": .}' \
  default_deploy.yaml deploy.yaml

Upvotes: 0
