VIPIN KUMAR

Reputation: 3137

How to fetch the data between specific patterns?

How can I print formatted data from a payload log like the following?

Time: 1970-01-01T00
ID:sdafasdfsladfsdfas
Key:sadfljasdkfsf
record {'record1:32423
 record2:45245
 record3:dasfas'
}
Time: 1970-01-01T001
ID:sdafasdfsladfsdfas1
Key:sadfljasdkfsf1
record {'record1:32423
 record2:452451
 record3:dasfas1'
}

The output I am looking for is:

1970-01-01T00|sdafasdfsladfsdfas|sadfljasdkfsf|record1:32423
                                               record2:45245
                                               record3:dasfas
1970-01-01T001|sdafasdfsladfsdfas1|sadfljasdkfsf1|record1:32423
                                               record2:452451
                                               record3:dasfas1

I was able to achieve this output using multiple commands, but I am looking for the right approach to get it in one command.

Upvotes: 0

Views: 79

Answers (2)

RomanPerekhrest

Reputation: 92854

Awk solution:

awk -F': *' '/^(Time|ID|Key)/{ 
                 printf "%s|", $2; 
                 len += length($2)+1
            }
            /^record/{
                f=1; sub(/^[^{]+\{\047/, "");
                printf "%s\n", $0; next 
            }
            /\}/{ f=len=0 }
            f{ 
                gsub(/^ *|\047$/, "");
                printf "%+" len+length($0) "s\n", $0
            }' file
  • -F': *' - set the field separator to a colon followed by optional spaces
  • /^(Time|ID|Key)/ - on a line that starts with one of the keys Time, ID or Key:
    • printf "%s|", $2 - print its value (the 2nd field, $2) followed by |, without a line break
    • len += length($2)+1 - accumulate the length of each printed item (+1 for the trailing | character)
  • /^record/ - on the main record line:
    • f=1 - set the flag f, marking that a record section is being processed
    • sub(/^[^{]+\{\047/, "") - remove the record {' sequence at the start of the line (\047 is the octal escape for a single quote)
    • printf "%s\n", $0 - print the remaining record1:<value> text $0 right after the previously printed values, ending the line with a newline \n
    • next - skip to the next input line
  • /\}/{ f=len=0 } - on a line containing } (the end of a processed section), reset the flag and the accumulated length
  • f{ ... } - while inside a record section:
    • gsub(/^ *|\047$/, "") - remove leading spaces and a trailing single quote '
    • printf "%+" len+length($0) "s\n", $0 - print the continuation line $0 right-aligned in a field of width len+length($0), which pads it with len leading spaces
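The padding trick in the last step can be tried in isolation. The following is only a minimal sketch: the helper name pad_demo and the header text are made up for illustration and are not part of the answer. It builds the printf format by string concatenation, so the width is the indent plus the length of the text being printed.

```shell
# pad_demo is a hypothetical helper name used only for this illustration.
pad_demo() {
    awk 'BEGIN {
        prefix = "1970-01-01T00|id|key|"    # made-up header text, 21 characters long
        first  = "record1:32423"
        rest   = "record2:45245"
        print prefix first
        # Width = indent + text length, so the right-aligned text is
        # preceded by exactly length(prefix) spaces.
        printf "%" (length(prefix) + length(rest)) "s\n", rest
    }'
}

pad_demo
```

The second output line should start in the same column as record1 on the first line, because the field width grows with the text while the indent stays fixed.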

The output:

1970-01-01T00|sdafasdfsladfsdfas|sadfljasdkfsf|record1:32423
                                               record2:45245
                                               record3:dasfas
1970-01-01T001|sdafasdfsladfsdfas1|sadfljasdkfsf1|record1:32423
                                                  record2:452451
                                                  record3:dasfas1

Upvotes: 2

RavinderSingh13

Reputation: 133518

Could you please try the following awk and let me know if it helps.

Solution 1: the continuation record lines are aligned with the start of the first record string in their own block:

awk -v s1="'" -F' |:' '    ## Set s1 to a single quote and use a space or a colon as the field separator.
$0 ~ s1{                   ## If the line contains a single quote (s1):
  gsub(s1,"")}             ## remove every single quote from the line.
{
  sub(/^ +/,"")            ## Strip leading spaces from every line (if present).
}
/^}/{                      ## If the line starts with } (end of a block):
  len=i=add=val=""         ## reset the variables len, i, add and val.
}
/Time/||/ID/||/Key/{       ## If the line contains Time, ID or Key:
  val=val?val "|" $NF:$NF  ## append the last field to val, separated by a pipe.
}
/^record /{                ## If the line starts with "record " (the first record line of a block):
  sub(/{/,"");             ## remove the { from the line.
  add=length($NF)>prev?length($NF)-prev:(prev?prev:prev-length($NF)); ## Set add to length($NF)-prev if length($NF)>prev; otherwise to prev if prev is non-zero, else to prev-length($NF).
  val=val?val "|" $2 ":"  $3:$2 ":" $3; ## Append $2 ":" $3 (the first record string) to val.
  prev=length($NF)         ## Remember the length of the last field in prev.
}
/^record[0-9]+/{           ## If the line starts with record followed by digits (a continuation line):
  add=length($NF)>prev?length($NF)-prev:prev-length($NF); ## Set add to length($NF)-prev if length($NF)>prev, otherwise to prev-length($NF).
  len=length(val)+add;     ## The output width len is the length of val plus add.
  if(++i==1){              ## On the first continuation line of a block:
    print val};            ## print the header line held in val.
  printf("%+"len"s\n",$0)  ## Print the current line right-aligned in a field of width len.
}
' length_question          ## Input file name.

Output will be as follows.

1970-01-01T00|sdafasdfsladfsdfas|sadfljasdkfsf|record1:32423
                                               record2:45245
                                               record3:dasfas
1970-01-01T001|sdafasdfsladfsdfas1|sadfljasdkfsf1|record1:32423
                                                  record2:452451
                                                  record3:dasfas1
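The val=val?val "|" $NF:$NF idiom used above can be checked on its own. This is just a small sketch under stated assumptions: the helper name join_demo and the three input lines are invented for the demonstration. Note that the ternary test treats an empty or zero val as "not set yet", so a first value of 0 would be overwritten rather than joined.

```shell
# join_demo is a hypothetical helper name; the three input lines are made up.
join_demo() {
    printf 'Time: 1970-01-01T00\nID:abc\nKey:def\n' |
    awk -F' |:' '
        { val = val ? val "|" $NF : $NF }   # first value as-is, later ones after a pipe
        END { print val }
    '
}

join_demo
```

With the separator ' |:' the line "Time: 1970-01-01T00" splits into three fields (the middle one empty), which is why the answer reads the value from $NF rather than from $2.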

Solution 2: as in the OP's expected output, all continuation record lines start at the column of the very first record string:

awk -v s1="'" -F' |:' '  ## Set s1 to a single quote and use a space or a colon as the field separator.
$0 ~ s1{                 ## If the line contains a single quote (s1):
  gsub(s1,"")            ## remove every single quote from the line.
}
{
  sub(/^ +/,"")          ## Strip leading spaces from every line (if present).
}
/^}/{                    ## If the line starts with } (end of a block):
   len=i=add=val=""      ## reset the variables len, i, add and val.
}
/Time/||/ID/||/Key/{     ## If the line contains Time, ID or Key:
   val=val?val "|" $NF:$NF ## append the last field to val, separated by a pipe.
}
/^record /{              ## If the line starts with "record " (the first record line of a block):
   sub(/{/,"");          ## remove the { from the line.
   add=length($NF)>prev?length($NF)-prev:(prev?prev:prev-length($NF)); ## Set add to length($NF)-prev if length($NF)>prev; otherwise to prev if prev is non-zero, else to prev-length($NF).
   val=val?val "|" $2 ":"  $3:$2 ":" $3; ## Append $2 ":" $3 (the first record string) to val.
   if(!prev_val){        ## If prev_val is not set yet (first block only):
     prev_val=length(val)}; ## store the length of val in prev_val.
   prev=length($NF)      ## Remember the length of the last field in prev.
}
/^record[0-9]+/{         ## If the line starts with record followed by digits (a continuation line):
   add=length($NF)>prev?length($NF)-prev:prev-length($NF);## Set add to length($NF)-prev if length($NF)>prev, otherwise to prev-length($NF).
   len=prev_val+add;     ## The output width len is prev_val plus add.
   if(++i==1){print val} ## On the first continuation line of a block, print the header line held in val.
   printf("%+"len"s\n",$0) ## Print the current line right-aligned in a field of width len.
}
' length_question        ## Input file name.

Output will be as follows:

1970-01-01T00|sdafasdfsladfsdfas|sadfljasdkfsf|record1:32423
                                               record2:45245
                                               record3:dasfas
1970-01-01T001|sdafasdfsladfsdfas1|sadfljasdkfsf1|record1:32423
                                               record2:452451
                                               record3:dasfas1
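For comparison, the same fixed-indent layout can probably be reached with less bookkeeping by remembering the header length of the first block and reusing it for every later block. The sketch below is not the code from either answer; the names sample_log, align_fixed, head, indent and inrec are hypothetical. It assumes the input has exactly the shape shown in the question and that the values contain no colons, since the separator ': *' would otherwise cut them short.

```shell
# sample_log reproduces the log from the question (hypothetical helper).
sample_log() {
cat <<'EOF'
Time: 1970-01-01T00
ID:sdafasdfsladfsdfas
Key:sadfljasdkfsf
record {'record1:32423
 record2:45245
 record3:dasfas'
}
Time: 1970-01-01T001
ID:sdafasdfsladfsdfas1
Key:sadfljasdkfsf1
record {'record1:32423
 record2:452451
 record3:dasfas1'
}
EOF
}

# align_fixed reads the log from stdin or from the files given as arguments.
align_fixed() {
    awk -F': *' '
    /^(Time|ID|Key)/ { head = head $2 "|"; next }     # collect the header values
    /^record/ {
        sub(/^[^{]*[{]\047/, "")                      # drop everything up to the opening quote
        if (!indent) indent = length(head)            # freeze the indent at the first block
        print head $0
        head = ""; inrec = 1; next
    }
    /^[}]/ { inrec = 0; next }                        # end of block
    inrec {
        gsub(/^ +|\047$/, "")                         # trim leading spaces and a trailing quote
        printf "%" (indent + length($0)) "s\n", $0    # pad with indent spaces
    }' "$@"
}

sample_log | align_fixed
```

Removing the if (!indent) guard, so that indent is recomputed for every block, should give the per-block alignment of Solution 1 instead.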

Upvotes: 2
