absurd

Reputation: 1115

Ansible filter a list of dictionaries to only contain unique values in one field

I have a list of dictionaries in an Ansible variable. Some dictionaries have the same values in the 'id' and 'name' fields but differ in other key-value pairs (which are not important to me). I want to filter out all dictionaries that are "duplicates" with respect to the 'name' and 'id' fields.

Example:

[{
        "name": "abc",
          "id": "123456",
   "other_key": "unimportant value"
  },
  {
        "name": "abc",
          "id": "123456",
   "other_key": "another unimportant value"
  },
  {
        "name": "bcd",
          "id": "789012",
   "other_key": "unimportant value"
  }]

Desired result:

[{
        "name": "abc",
          "id": "123456"
  },
  {
        "name": "bcd",
          "id": "789012"
  }]

How can I achieve this in Ansible? (The 'other_key' field does not have to be discarded. It could also be kept from, e.g., the first occurrence; it simply does not matter.)

I already produced a list of unique ids with:

{{ mydictionaries | map(attribute='id') | unique | list }}

But how do I filter the list of dictionaries with this?

Upvotes: 3

Views: 6803

Answers (2)

Vladimir Botka

Reputation: 68044

Q: "Filter out dictionaries that are "duplicates" regarding the 'name' and 'id' fields."

Update.

A: Given the list

  my_list:
    - {id: '123456', name: abc, other_key: unimportant value}
    - {id: '123456', name: abc, other_key: another unimportant value}
    - {id: '789012', name: bcd, other_key: unimportant value}

declare the variables

  my_hash: "{{ my_list|json_query('[].[name, id]')|
                       map('join')|
                       map('hash')|
                       map('community.general.dict_kv', 'hash') }}"
  my_list2: "{{ my_list|zip(my_hash)|map('combine') }}"
  my_dict3: "{{ dict(my_list2|groupby('hash')|
                              map('last')|
                              map('first')|
                              json_query('[].[name, id]')) }}"
  my_list3: "{{ my_dict3|dict2items(key_name='name', value_name='id') }}"

These give

  my_hash:
    - {hash: 370194ff6e0f93a7432e16cc9badd9427e8b4e13}
    - {hash: 370194ff6e0f93a7432e16cc9badd9427e8b4e13}
    - {hash: f3814d5b43a4d67d7636ec64be828d82a92eedbb}

  my_list2:
  - hash: 370194ff6e0f93a7432e16cc9badd9427e8b4e13
    id: '123456'
    name: abc
    other_key: unimportant value
  - hash: 370194ff6e0f93a7432e16cc9badd9427e8b4e13
    id: '123456'
    name: abc
    other_key: another unimportant value
  - hash: f3814d5b43a4d67d7636ec64be828d82a92eedbb
    id: '789012'
    name: bcd
    other_key: unimportant value

  my_dict3:
    abc: '123456'
    bcd: '789012'

  my_list3:
    - {id: '123456', name: abc}
    - {id: '789012', name: bcd}
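
For readers who want to see what this pipeline computes, here is a rough Python sketch of the same idea. It is not Ansible code. It assumes the hash filter uses SHA-1 by default and that map('join') concatenates name and id with no separator; the helper names add_hash and dedupe_by_hash are made up for illustration.

```python
import hashlib

my_list = [
    {"id": "123456", "name": "abc", "other_key": "unimportant value"},
    {"id": "123456", "name": "abc", "other_key": "another unimportant value"},
    {"id": "789012", "name": "bcd", "other_key": "unimportant value"},
]


def add_hash(item):
    """Return a copy of item with a 'hash' key built from name + id (hypothetical helper)."""
    # Assumption: SHA-1 over the plain concatenation, mirroring map('join')|map('hash').
    digest = hashlib.sha1((item["name"] + item["id"]).encode("utf-8")).hexdigest()
    return {**item, "hash": digest}


def dedupe_by_hash(items):
    """Keep the first item seen for each hash and return only name and id."""
    first_seen = {}
    for item in map(add_hash, items):
        # setdefault stores an item only if its hash has not been seen yet
        first_seen.setdefault(item["hash"], item)
    return [{"name": i["name"], "id": i["id"]} for i in first_seen.values()]


my_list3 = dedupe_by_hash(my_list)
print(my_list3)
```

Two caveats about the Ansible version, as far as I can tell. Joining without a separator means the pairs ("ab", "c1") and ("abc", "1") would produce the same hash, so passing a separator to join is probably safer for arbitrary data. Also, my_dict3 is keyed by name, so two entries with the same name but different ids would collapse into one. The sketch keeps input order, whereas groupby sorts by hash.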

Optionally, convert id from string to number with the JMESPath function to_number. The declarations

  my_dict3: "{{ dict(my_list2|groupby('hash')|
                              map('last')|
                              map('first')|
                              json_query('[].[name, to_number(id)]')) }}"
  my_list3: "{{ my_dict3|dict2items(key_name='name', value_name='id') }}"

These give

  my_dict3:
    abc: 123456
    bcd: 789012

  my_list3:
    - {id: 123456, name: abc}
    - {id: 789012, name: bcd}

Example of a complete playbook for testing

- hosts: localhost

  vars:

    my_list:
      - {id: '123456', name: abc, other_key: unimportant value}
      - {id: '123456', name: abc, other_key: another unimportant value}
      - {id: '789012', name: bcd, other_key: unimportant value}

    my_hash: "{{ my_list|json_query('[].[name, id]')|
                         map('join')|
                         map('hash')|
                         map('community.general.dict_kv', 'hash') }}"
    my_list2: "{{ my_list|zip(my_hash)|map('combine') }}"
    my_dict3: "{{ dict(my_list2|groupby('hash')|
                                map('last')|
                                map('first')|
                                json_query('[].[name, to_number(id)]')) }}"
                              # json_query('[].[name, id]')) }}"
    my_list3: "{{ my_dict3|dict2items(key_name='name', value_name='id') }}"

  tasks:

    - debug:
        var: my_list|to_yaml
    - debug:
        var: my_hash|to_yaml
    - debug:
        var: my_list2
    - debug:
        var: my_dict3
    - debug:
        var: my_list3|to_yaml

Origin.

Given the data is stored in the variable my_list, let's add the hash attribute, created from the name and id attributes, to the list. For example,

    - set_fact:
        my_list2: "{{ my_list2|default([]) +
                      [item|combine({'hash': (item.name ~ item.id)|hash})] }}"
      loop: "{{ my_list }}"
    - debug:
        var: my_list2

gives

  my_list2:
  - hash: 370194ff6e0f93a7432e16cc9badd9427e8b4e13
    id: '123456'
    name: abc
    other_key: unimportant value
  - hash: 370194ff6e0f93a7432e16cc9badd9427e8b4e13
    id: '123456'
    name: abc
    other_key: another unimportant value
  - hash: f3814d5b43a4d67d7636ec64be828d82a92eedbb
    id: '789012'
    name: bcd
    other_key: unimportant value

Next, use the groupby filter to group the items by hash and select the required attributes from the first item of each group. For example,

    - set_fact:
        my_list3: "{{ my_list3|default([]) +
                      [{'name': item.1.0.name, 'id': item.1.0.id}] }}"
      loop: "{{ my_list2|groupby('hash') }}"
    - debug:
        var: my_list3

gives

  my_list3:
    - {id: '123456', name: abc}
    - {id: '789012', name: bcd}
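
The item.1.0 expression in the loop can look cryptic, so here is a small Python sketch of what I understand groupby to return: a list of (key, members) pairs sorted by key, where item.1 is the member list and item.1.0 its first element. This is an illustration only; the hash values h1 and h2 are placeholders, not real digests.

```python
from itertools import groupby
from operator import itemgetter

# Placeholder hashes stand in for the real SHA-1 digests shown above.
my_list2 = [
    {"hash": "h1", "id": "123456", "name": "abc", "other_key": "unimportant value"},
    {"hash": "h1", "id": "123456", "name": "abc", "other_key": "another unimportant value"},
    {"hash": "h2", "id": "789012", "name": "bcd", "other_key": "unimportant value"},
]

# itertools.groupby only groups adjacent items, so sort by the key first.
by_hash = sorted(my_list2, key=itemgetter("hash"))
groups = [(key, list(members)) for key, members in groupby(by_hash, key=itemgetter("hash"))]

my_list3 = []
for item in groups:
    first = item[1][0]  # same position as item.1.0 in the Ansible loop
    my_list3.append({"name": first["name"], "id": first["id"]})

print(my_list3)
```

Because the sort is stable, the first member of each group is the first occurrence in the original list, which matches the "just keep the first one" requirement in the question.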

Example of a complete playbook for testing

- hosts: localhost

  vars:

    my_list:
      - {id: '123456', name: abc, other_key: unimportant value}
      - {id: '123456', name: abc, other_key: another unimportant value}
      - {id: '789012', name: bcd, other_key: unimportant value}

  tasks:

    - set_fact:
        my_list2: "{{ my_list2|default([]) +
                      [item|combine({'hash': (item.name ~ item.id)|hash})] }}"
      loop: "{{ my_list }}"
    - debug:
        var: my_list2

    - set_fact:
        my_list3: "{{ my_list3|default([]) +
                      [{'name': item.1.0.name, 'id': item.1.0.id}] }}"
      loop: "{{ my_list2|groupby('hash') }}"
    - debug:
        var: my_list3|to_yaml

Upvotes: 2

Zeitounator

Reputation: 44615

You can filter only the desired keys from your map list with json_query and then apply the unique filter.

{{ mydictionaries | json_query('[].{"name": name, "id": id}') | unique }}

Below is a proof-of-concept playbook. Please note that json_query requires the jmespath Python package (pip install jmespath) on the Ansible controller.

---
- name: Unique filtered dictionaries list example
  hosts: localhost
  gather_facts: false

  vars:
    mydictionaries:
      - {"name": "abc", "id": "123456", "other_key": "unimportant value"}
      - {"name": "abc", "id": "123456", "other_key": "another unimportant value"}
      - {"name": "bcd", "id": "789012", "other_key": "unimportant value"}

  tasks:
    - name: Filter out list as wanted
      debug:
        msg: >-
          {{
            mydictionaries
            | json_query('[].{"name": name, "id": id}')
            | unique
          }}

which gives

PLAY [Unique filtered dictionaries list example] ******************************

TASK [Filter out list as wanted] **********************************************
ok: [localhost] => {
    "msg": [
        {
            "id": "123456",
            "name": "abc"
        },
        {
            "id": "789012",
            "name": "bcd"
        }
    ]
}

PLAY RECAP ********************************************************************
localhost                  : ok=1    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
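
If it helps to see the logic outside of Jinja2, the project-then-unique approach above can be sketched in plain Python. This is only an illustration under my own assumptions: the helpers project and unique are hypothetical names, and unique here is a simple order-preserving stand-in for the Ansible filter, not its actual implementation.

```python
mydictionaries = [
    {"name": "abc", "id": "123456", "other_key": "unimportant value"},
    {"name": "abc", "id": "123456", "other_key": "another unimportant value"},
    {"name": "bcd", "id": "789012", "other_key": "unimportant value"},
]


def project(item, keys=("name", "id")):
    """Keep only the wanted keys, like the json_query projection (hypothetical helper)."""
    return {key: item[key] for key in keys}


def unique(items):
    """Drop repeated items while keeping the first occurrence of each."""
    seen = []
    for item in items:
        # dicts are not hashable, so compare against a list instead of a set
        if item not in seen:
            seen.append(item)
    return seen


result = unique([project(d) for d in mydictionaries])
print(result)
```

The key point is the order of operations: once other_key is projected away, the first two dictionaries become equal, so a plain equality-based unique is enough and no hashing or grouping is needed.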

Upvotes: 2
