Dave

Reputation: 1264

linux / bash parse through json-like data

Here is some data that I have:

animal { 
    dog {
        body {
            parts {
                legs = old
                brain = average
                tail= curly
                }
   
            }
        }
    cat {
        body {
            parts {
                legs = new
                brain = average
                tail {
                    base=hairy
                    tip=nothairy
                }
   
            }
        }
    }
}

Notice the data is not really json as it has the following rules:

Is it even possible to parse this with awk or sed? I tried jq, but it does not work because this isn't valid JSON.

My goal is to display only "dog" and "cat", because they are the top-level keys under "animal".

$ some-magical-command
dog
cat

Upvotes: 0

Views: 88

Answers (3)

Ed Morton

Reputation: 204731

To do what you currently want and for ease of any future manipulation of your data, you could use any POSIX awk (for character classes) to convert your structure to JSON and then use jq on it:

$ cat tst.awk
BEGIN { print "{" }
!NF { next }
{
    sub(/[[:space:]]+$/,"")
    gsub(/[[:alnum:]_]+/,"\"&\"")
    gsub(/ *= */,": ")
    sub(/" *{/,"\": {")
}
(++nr) > 1 {
    sep = ( /"/ && (prev ~ /["}]$/) ? "," : "" )
    printf "%s%s%s", prev, sep, ORS
}
{ prev = $0 }
END { print prev ORS "}" }

$ awk -f tst.awk file
{
"animal": {
    "dog": {
        "body": {
            "parts": {
                "legs": "old",
                "brain": "average",
                "tail": "curly"
                }
            }
        },
    "cat": {
        "body": {
            "parts": {
                "legs": "new",
                "brain": "average",
                "tail": {
                    "base": "hairy",
                    "tip": "nothairy"
                }
            }
        }
    }
}
}

Current and some possible future uses:

$ awk -f tst.awk file | jq -r '.animal | keys[]'
cat
dog

$ awk -f tst.awk file | jq -r '.animal.dog.body.parts | keys[]'
brain
legs
tail

$ awk -f tst.awk file | jq -r '.animal.dog.body.parts'
{
  "legs": "old",
  "brain": "average",
  "tail": "curly"
}

$ awk -f tst.awk file | jq -r '.animal.cat.body.parts'
{
  "legs": "new",
  "brain": "average",
  "tail": {
    "base": "hairy",
    "tip": "nothairy"
  }
}

The above assumes your input always looks as shown in your question.
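To see what the substitutions in tst.awk do to a single line, here is a minimal sketch that applies only the per-line rewrites and leaves out the comma and brace bookkeeping. The function name to_json_line is made up for this illustration, and the opening brace is written as [{] purely so that no awk tries to read it as an interval expression.

```shell
#!/bin/sh
# Hypothetical helper (not part of tst.awk): runs the same per-line
# substitutions as tst.awk, but does not add commas or the outer braces.
to_json_line() {
    awk '{
        sub(/[[:space:]]+$/, "")        # drop trailing whitespace
        gsub(/[[:alnum:]_]+/, "\"&\"")  # wrap every key and value in double quotes
        gsub(/ *= */, ": ")             # turn "=" (with optional spaces) into ": "
        sub(/" *[{]/, "\": {")          # put ": " between a block name and its brace
        print
    }'
}

# Three representative input shapes: spaced "=", block opener, unspaced "=".
printf '%s\n' 'tail= curly' 'dog {' 'base=hairy' | to_json_line
```

This prints "tail": "curly", then "dog": {, then "base": "hairy", one per line. The remaining rules in tst.awk only decide where a trailing comma is needed.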

Upvotes: 1

rici

Reputation: 241971

If you only need the second-level keys, and you're not too concerned about producing good error messages for erroneous inputs, then it's pretty straightforward. The basic idea is this:

  1. There are three formats for an input line:

    • ID {
    • ID = value (where the = might not be space-separated)
    • }

  2. As the lines are read, we keep track of nesting depth by incrementing a counter with the first line type and decrementing it with the third line type.

  3. When the nesting counter is 1, if the line has an ID field, we print it.

That can be done quite simply with an awk script. Save the script in a file with a name like level2_keys.awk; you can then run it with awk -f level2_keys.awk /path/to/input/file. Note that every rule except the last ends with next; so that once a rule has matched a line, the rules after it are not evaluated for that line.

$1 == "}"    { # Decrement nesting on close
               --nesting;
               next;
             }
/=/          { # Remove the if block if you don't want to print these keys.
               if (nesting == 1) {
                 gsub("=", " = ");    # Force = to be a field
                 print($1);
               }
               next;
             }
$2 == "{"    { # Increment nesting (and maybe print) on open
               if (nesting == 1) print($1);
               ++nesting;
               next;
             }
# NF is non-zero if the line is not blank.
NF           { print "Bad input at " NR ": '"$0"'" > "/dev/stderr"; }
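If other levels are needed later, the same counter idea can be parameterised. The following is only a sketch under that assumption: the function name keys_at_depth and the depth variable are invented for this example, the error-reporting rule is left out for brevity, and the sample data from the question is inlined so the snippet runs on its own.

```shell
#!/bin/sh
# Hypothetical variation on the script above: print the IDs found at a
# chosen nesting depth (depth 1 means directly inside the outermost block).
keys_at_depth() {
    awk -v depth="$1" '
        $1 == "}" { --nesting; next }             # close: go up one level
        /=/       { if (nesting == depth) {
                        gsub("=", " = ")          # force = to be its own field
                        print $1
                    }
                    next }
        $2 == "{" { if (nesting == depth) print $1
                    ++nesting; next }             # open: maybe print, go down
    '
}

# Sample data copied from the question (blank lines omitted).
sample='animal {
    dog {
        body {
            parts {
                legs = old
                brain = average
                tail= curly
                }
            }
        }
    cat {
        body {
            parts {
                legs = new
                brain = average
                tail {
                    base=hairy
                    tip=nothairy
                }
            }
        }
    }
}'

printf '%s\n' "$sample" | keys_at_depth 1
```

With depth 1 this prints dog and cat; with depth 3 it prints parts twice, once for each animal.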

Upvotes: 1

glenn jackman

Reputation: 247240

It's fairly close to Tcl syntax, if you feel like learning a new language.

set data {
    animal { 
        dog {
            body {
                parts {
                    legs = old
                    brain = large
                    tail= curly
                    }
       
                }
            }
        cat {
            body {
                parts {
                    legs = new
                    brain = tiny
                    tail {
                        base=hairy
                        tip=nothairy
                    }
       
                }
            }
        }
    }
}

set data [regsub -line -all {\s*=\s*(.+)} $data { "\1"}]

dict get $data animal dog body parts brain    ;# => large

I know some people who would argue about your classification of dog brains vs cat brains...

Upvotes: 1
