user2854333

Reputation: 640

Data Formatting from Text File

I have data in the following format stored in a file:

    ABC:9804
    {
      "count" : 492,
      "_shards" : {
        "total" : 19,
        "successful" : 19,
        "failed" : 0
      }
    }
    Bye
    ABC:95023
    {
      "count" : 865,
      "_shards" : {
        "total" : 19,
        "successful" : 19,
        "failed" : 0
      }
    }
    Bye
    ABCC:128
    {
      "count" : 479,
      "_shards" : {
        "total" : 19,
        "successful" : 19,
        "failed" : 0
      }
    }
    Bye

I am trying to get output like this:

    ABC:9804 , 492
    ABC:95023 , 865
    ABCC:128 , 479

I tried using awk to get the 1st and 3rd lines, but that is not working.

Upvotes: 0

Views: 42

Answers (2)

Akshay Hegde

Reputation: 16997

Input

$ cat infile
            ABC:9804
            {
              "count" : 492,
              "_shards" : {
                "total" : 19,
                "successful" : 19,
                "failed" : 0
              }
            }
            Bye
            ABC:95023
            {
              "count" : 865,
              "_shards" : {
                "total" : 19,
                "successful" : 19,
                "failed" : 0
              }
            }
            Bye
            ABCC:128
            {
              "count" : 479,
              "_shards" : {
                "total" : 19,
                "successful" : 19,
                "failed" : 0
              }
            }
            Bye

Output

Gawk

$ awk -F':' -v RS='[{},\n]' '/ABC.*|"count"/{gsub(/[\n\t ]+/,""); printf "%s", (/"/ ? ", " $2 "\n" : $0)}' infile
ABC:9804, 492
ABC:95023, 865
ABCC:128, 479

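More readable

awk -F':' -v RS='[{},\n]' '
    # RS splits the input on braces, commas and newlines, so each label
    # and each "count" pair arrives as its own record.
    /ABC.*|"count"/ {
        gsub(/[\n\t ]+/, "")      # strip whitespace; $0 is re-split on ":"
        # count record: print ", <value>" and end the line; label record: print it as-is
        printf "%s", (/"/ ? ", " $2 "\n" : $0)
    }
' infile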

Non-gawk

$ awk -F'[ ,]' -v OFS=", " '/^[ \t]*(ABC.*|"count"[ ]?):/{ sub(/^[ \t]+/,""); printf "%s", (/"/ ? OFS $(NF-1) RS : $0) }' infile
ABC:9804, 492
ABC:95023, 865
ABCC:128, 479

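More readable

awk -F'[ ,]' -v OFS=", " '
    # Match label lines (ABC...:) and "count" lines, indented or not.
    /^[ \t]*(ABC.*|"count"[ ]?):/ {
        sub(/^[ \t]+/, "")        # drop leading whitespace; $0 is re-split on FS
        # count line: the value is the field before the trailing comma
        printf "%s", (/"/ ? OFS $(NF-1) RS : $0)
    }
' infile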

Upvotes: 0

RomanPerekhrest

Reputation: 92854

awk solution:

awk '/^ABC.*:/{ abc=$0 }$0~/"count"/{ gsub(/[^0-9]+/,"",$0); print abc" , "$0 }' file

The output:

ABC:9804 , 492
ABC:95023 , 865
ABCC:128 , 479
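
For anyone who wants to see the steps spelled out, here is a longer sketch of the same idea: remember the most recent label line, then print it when the "count" line arrives. It assumes each label looks like letters, a colon, then digits, and that it may or may not be indented. The function name `extract_counts` and the temporary sample file are made up for illustration and are not part of the original question.

```shell
#!/bin/sh
# Hypothetical sample file reproducing two records from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
ABC:9804
{
  "count" : 492,
  "_shards" : {
    "total" : 19,
    "successful" : 19,
    "failed" : 0
  }
}
Bye
ABCC:128
{
  "count" : 479,
  "_shards" : {
    "total" : 19,
    "successful" : 19,
    "failed" : 0
  }
}
Bye
EOF

# extract_counts FILE: print "label , count" for each record in FILE.
extract_counts() {
  awk '
    # Label line: optional indentation, letters, a colon, digits.
    /^[ \t]*[A-Za-z]+:[0-9]+[ \t]*$/ {
      label = $0
      gsub(/[ \t]/, "", label)   # drop any indentation
      next
    }
    # Count line: keep only the digits of the value.
    /"count"/ {
      value = $0
      gsub(/[^0-9]/, "", value)
      print label " , " value
    }
  ' "$1"
}

result=$(extract_counts "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Copying the line into a separate variable before calling gsub leaves $0 untouched, which makes it easier to add more rules later (for example, one that also picks up "failed").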

Upvotes: 2
