Jiri Pencak

Reputation: 21

JSON filtering in bash

I need help filtering data in JSON. I have the following JSON:

{
    "cluster_name": "k8s-cluster-1",
    "cluster_env": "PROD",
    "cluster_hosts": [
      {
        "host_type": "MASTER",
        "host_hostname": "aa083",
        "host_username": "kubeadm",
        "host_kubernetes_version": "",
      },
      {
        "host_type": "NODE",
        "host_hostname": "aa084",
        "host_username": "kubeadm",
        "host_kubernetes_version": "",
      },
      {
        "host_type": "NODE",
        "host_hostname": "aa085",
        "host_username": "kubeadm",
        "host_kubernetes_version": "",
      }
    ],
},
{
    "cluster_name": "k8s-cluster-2",
    "cluster_env": "PROD",
    "cluster_hosts": [
      {
        "host_type": "MASTER",
        "host_hostname": "ab093",
        "host_username": "kubeadm",
        "host_kubernetes_version": "",
      },
      {
        "host_type": "NODE",
        "host_hostname": "ab094",
        "host_username": "kubeadm",
        "host_kubernetes_version": "",
      },
      {
        "host_type": "NODE",
        "host_hostname": "ab095",
        "host_username": "kubeadm",
        "host_kubernetes_version": "",
      }
    ],
}

And the output should look like this:

cluster_name
host_type
host_name
.
.
.
k8s-cluster-1
MASTER
aa083
NODE
aa084
NODE
aa085
k8s-cluster-2
MASTER
ab093
NODE
ab094
NODE
ab095

I tried this command

cat json |jq -c '.[] | .cluster_name,.cluster_hosts[] | {host_type,host_name}'

but it shows an error

jq: error (at <stdin>:1): Cannot index string with string "host_type"

I need to do this in bash.

Could you please help me?

Thank you in advance.

Upvotes: 0

Views: 447

Answers (1)

pmf

Reputation: 36033

The input you posted is not valid JSON: it has several superfluous commas (after the last field of every host object and after each cluster_hosts array). Assuming your JSON file should be a stream of objects, i.e. look like this:

{
    "cluster_name": "k8s-cluster-1",
    "cluster_env": "PROD",
    "cluster_hosts": [
      {
        "host_type": "MASTER",
        "host_hostname": "aa083",
        "host_username": "kubeadm",
        "host_kubernetes_version": ""
      },
      {
        "host_type": "NODE",
        "host_hostname": "aa084",
        "host_username": "kubeadm",
        "host_kubernetes_version": ""
      },
      {
        "host_type": "NODE",
        "host_hostname": "aa085",
        "host_username": "kubeadm",
        "host_kubernetes_version": ""
      }
    ]
}
{
    "cluster_name": "k8s-cluster-2",
    "cluster_env": "PROD",
    "cluster_hosts": [
      {
        "host_type": "MASTER",
        "host_hostname": "ab093",
        "host_username": "kubeadm",
        "host_kubernetes_version": ""
      },
      {
        "host_type": "NODE",
        "host_hostname": "ab094",
        "host_username": "kubeadm",
        "host_kubernetes_version": ""
      },
      {
        "host_type": "NODE",
        "host_hostname": "ab095",
        "host_username": "kubeadm",
        "host_kubernetes_version": ""
      }
    ]
}

Then you just need to list the fields you are interested in. Use the -r flag to output raw text instead of JSON strings (which would be quoted):

jq -r '.cluster_name, (.cluster_hosts[] | .host_type, .host_hostname)'
k8s-cluster-1
MASTER
aa083
NODE
aa084
NODE
aa085
k8s-cluster-2
MASTER
ab093
NODE
ab094
NODE
ab095

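As for the error in the question: in jq the pipe binds more loosely than the comma, so .cluster_name, .cluster_hosts[] | {...} is parsed as (.cluster_name, .cluster_hosts[]) | {...}, and the object construction is then applied to the cluster_name string as well. Note also that the key is host_hostname, not host_name. The following is only a minimal sketch of that difference; the one-line document in doc is a made-up, shortened sample, and jq is assumed to be installed:

```shell
# Hypothetical, shortened sample: one cluster with a single host.
doc='{"cluster_name":"k8s-cluster-1","cluster_hosts":[{"host_type":"MASTER","host_hostname":"aa083"}]}'

# Without parentheses the filter is parsed as
#   (.cluster_name, .cluster_hosts[]) | {host_type}
# so {host_type} is applied to the cluster_name string and jq exits with an error.
if printf '%s' "$doc" | jq -c '.cluster_name, .cluster_hosts[] | {host_type}' >/dev/null 2>&1; then
  ungrouped_ok=yes
else
  ungrouped_ok=no
fi
echo "unparenthesized filter succeeded: $ungrouped_ok"

# With parentheses the pipe only applies to the host objects.
grouped=$(printf '%s' "$doc" | jq -r '.cluster_name, (.cluster_hosts[] | .host_type, .host_hostname)')
printf '%s\n' "$grouped"
```

The parentheses are what keep .cluster_name out of the right-hand side of the pipe.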

If instead your input file is supposed to be an array (in which case the comma between the top-level objects was correct, but the enclosing array brackets [] were missing), simply prepend .[] | to the filter.
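For illustration, the array variant could look roughly like this. The sample data is a shortened, made-up version of the input (one host per cluster) held in a shell variable rather than a file, and jq is assumed to be installed:

```shell
# Hypothetical sample: the clusters wrapped in one top-level array,
# shortened to one host per cluster for brevity.
json='[
  {
    "cluster_name": "k8s-cluster-1",
    "cluster_hosts": [
      { "host_type": "MASTER", "host_hostname": "aa083" }
    ]
  },
  {
    "cluster_name": "k8s-cluster-2",
    "cluster_hosts": [
      { "host_type": "MASTER", "host_hostname": "ab093" }
    ]
  }
]'

# .[] unpacks the array first; the rest of the filter is unchanged.
result=$(printf '%s' "$json" | jq -r '.[] | .cluster_name, (.cluster_hosts[] | .host_type, .host_hostname)')
printf '%s\n' "$result"
```

With a real file you would pass its name as the last argument to jq instead of piping the variable in.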

Upvotes: 1
