Reputation: 103

Use awk to parse for variable length

I can't quite wrap my head around this one. I need to parse one or more lines starting at line 9, stripping the double quotes (") and commas (,).

So if the output is:

{
    "hash" : "000000000fe549a89848c76070d4132872cfb6efe5315d01d7ef77e4900f2d39",
    "confirmations" : 88029,
    "size" : 189,
    "height" : 227252,
    "version" : 2,
    "merkleroot" : "c738fb8e22750b6d3511ed0049a96558b0bc57046f3f77771ec825b22d6a6f4a",
    "tx" : [
        "c738fb8e22750b6d3511ed0049a96558b0bc57046f3f77771ec825b22d6a6f4a"
    ],
    "time" : 1398824312,
    "nonce" : 1883462912,
    "bits" : "1d00ffff",
    "difficulty" : 1.00000000,
    "chainwork" : "000000000000000000000000000000000000000000000000083ada4a4009841a",
    "previousblockhash" : "00000000c7f4990e6ebf71ad7e21a47131dfeb22c759505b3998d7a814c011df",
    "nextblockhash" : "00000000afe1928529ac766f1237657819a11cfcc8ca6d67f119e868ed5b6188"
}

I want c738fb8e22750b6d3511ed0049a96558b0bc57046f3f77771ec825b22d6a6f4a.

Or if the output is:

{
    "hash" : "000000000fe549a89848c76070d4132872cfb6efe5315d01d7ef77e4900f2d39",
    "confirmations" : 88029,
    "size" : 189,
    "height" : 227252,
    "version" : 2,
    "merkleroot" : "c738fb8e22750b6d3511ed0049a96558b0bc57046f3f77771ec825b22d6a6f4a",
    "tx" : [
        "c738fb8e22750b6d3511ed0049a96558b0bc57046f3f77771ec825b22d6a6f4a",
        "c738fb8e22750b6d3511ed0049a96558b0bc57046f3f77771ec825b22d6a6f4a",
        "c738fb8e22750b6d3511ed0049a96558b0bc57046f3f77771ec825b22d6a6f4a"
    ],
    "time" : 1398824312,
    "nonce" : 1883462912,
    "bits" : "1d00ffff",
    "difficulty" : 1.00000000,
    "chainwork" : "000000000000000000000000000000000000000000000000083ada4a4009841a",
    "previousblockhash" : "00000000c7f4990e6ebf71ad7e21a47131dfeb22c759505b3998d7a814c011df",
    "nextblockhash" : "00000000afe1928529ac766f1237657819a11cfcc8ca6d67f119e868ed5b6188"
}

I want c738fb8e22750b6d3511ed0049a96558b0bc57046f3f77771ec825b22d6a6f4a c738fb8e22750b6d3511ed0049a96558b0bc57046f3f77771ec825b22d6a6f4a c738fb8e22750b6d3511ed0049a96558b0bc57046f3f77771ec825b22d6a6f4a.

The hashes always start on line 9, but the list may extend well beyond it.

Note that I used hash c738fb8e22750b6d3511ed0049a96558b0bc57046f3f77771ec825b22d6a6f4a for clarity. The hash will be unique each time (but will always be the same length).

I would prefer answers with actual awk. No gawk. No Perl.

Upvotes: 0

Answers (3)

karakfa

Reputation: 67507

Another awk solution, based on the input format:

$ awk -F' +: +' 'NF!=1{p=0} p&&!/]/{gsub(/"|,/,""); print} $1~/"tx"/{p=1}' json

        c738fb8e22750b6d3511ed0049a96558b0bc57046f3f77771ec825b22d6a6f4a

And for the other input:

$ awk ... json2

        c738fb8e22750b6d3511ed0049a96558b0bc57046f3f77771ec825b22d6a6f4a
        c738fb8e22750b6d3511ed0049a96558b0bc57046f3f77771ec825b22d6a6f4a
        c738fb8e22750b6d3511ed0049a96558b0bc57046f3f77771ec825b22d6a6f4a
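
To see why the `NF!=1` test works, it may help to print the field count per line with the same field separator. This is only an illustrative sketch: `nf_per_line` is a hypothetical helper name and the sample input is a shortened stand-in for the real output. Lines containing ` : ` split into two fields, while the array entries and the closing bracket stay at one field, so the flag is only cleared once the next key/value line appears.

```shell
# nf_per_line: hypothetical helper, prints the number of fields awk sees
# on each input line when the separator is the regex " +: +".
nf_per_line() {
    awk -F' +: +' '{ print NF }'
}

# Shortened sample: a key line, one array entry, the closing bracket, a key line.
printf '%s\n' '    "tx" : [' '        "aaa",' '    ],' '    "time" : 1,' | nf_per_line
# prints 2, 1, 1, 2 (one number per line)
```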

Upvotes: 0

Ed Morton

Reputation: 203985

$ cat tst.awk
/^[[:space:]]*\]/ { inTx=0 }
inTx { gsub(/^[^"]*"|"[^"]*$/,""); print }
/^[[:space:]]*"tx"[[:space:]]*:[[:space:]]*\[/ { inTx=1 }

$ awk -f tst.awk file
c738fb8e22750b6d3511ed0049a96558b0bc57046f3f77771ec825b22d6a6f4a
c738fb8e22750b6d3511ed0049a96558b0bc57046f3f77771ec825b22d6a6f4a
c738fb8e22750b6d3511ed0049a96558b0bc57046f3f77771ec825b22d6a6f4a

The above is just an implementation of this common awk idiom:

 awk '/end/{f=0} f{print} /start/{f=1}'
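
The question shows the hashes joined on one line, so here is a minimal sketch of the same flag idiom that collects the entries and prints them space-separated at the end. It assumes the layout shown in the question (one entry per line, `"tx" : [` and `]` on their own lines); `extract_tx` is a hypothetical helper name, and `[ \t]` is used in place of `[[:space:]]` only so that it also runs on older awks that lack POSIX character classes.

```shell
# extract_tx: hypothetical helper. Reads the block JSON on stdin and prints
# the "tx" entries on a single line, separated by spaces.
extract_tx() {
    awk '
        # A line that starts with "]" ends the range.
        /^[ \t]*\]/ { inTx = 0 }

        # Inside the range: drop indentation, quotes and commas, then append.
        inTx {
            gsub(/[ \t",]/, "")
            out = out sep $0
            sep = " "
        }

        # The "tx" : [ line starts the range (tested last, so it is not printed).
        /"tx"[ \t]*:[ \t]*\[/ { inTx = 1 }

        END { if (out != "") print out }
    '
}

# Usage with a shortened stand-in for the real output:
printf '%s\n' '{' '    "tx" : [' '        "aaa",' '        "bbb"' '    ],' '    "time" : 1' '}' | extract_tx
# prints: aaa bbb
```

With the real data this would be something like `extract_tx < file`. The order of the three rules matters: the end test comes first and the start test last, so neither delimiter line is treated as an entry.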

Upvotes: 2

hek2mgl

Reputation: 158080

The problem is complicated because you are using the wrong tool. awk can't parse JSON; use jq for that:

jq -r '.tx[]' input.json

Upvotes: 1
