
Tags: regex, awk, environment-variables, text-manipulation

Reputation: 1820

AWK: dynamically change FS or RS

I cannot seem to get the trick to interchange the FS/RS variables dynamically, so that I get the following results from the input:

Input_file

header 1
header 2
{
something should not be removed
}

50

( 
auto1
{
    type        good;
    remove      not useful;
}

 auto2
{
    type        good;
    keep        useful;
}

 auto3
{
    type        moderate;
    remove      not useful;
}
)

Output_file

header 1
header 2
{
something that should not be removed
}

50

( 
auto1//good
{
    type        good;//good
}

auto2//good
{
    type        good;//good
    keep        useful;
}

auto3//moderate
{
    type        moderate;//moderate
}
)

The key things are:

No change should happen when a code block {...} is not preceded by autoX (X can be 1, 2, 3, etc.).
Changes should happen only when autoX is followed by a code block {...}.
Both the autoX line and the type value inside the code block get a suffix, //good or //moderate, which has to be read from the {...} block itself.
Any line inside {...} that contains the word remove should be deleted entirely.

HINT: It might be something that can use regex and the idea explained here, with this particular example.

For now, I only have been able to meet the last requirement, with the following code:

awk ' {$1=="{"; FS=="}";} {$1!="}"; gsub("remove",""); print NR"\t\t"$0}' Input_file

Thanks in advance, for your skill & time, to tackle this problem with awk.
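
Side note on the attempt above: $1=="{" and FS=="}" are comparisons whose results are discarded, so FS is never actually changed. Assigning with a single = inside a rule does change it, and the new value is used for the records read afterwards. A minimal sketch of that, where the "--" marker line and the function name demo_fs_switch are made up purely for illustration:

```shell
# Hypothetical demo: switch FS from the default to "," when a marker line is seen.
demo_fs_switch() {
  awk '
    $1 == "--" { FS = ","; next }   # assignment (=), not comparison (==)
    { print NF }                    # number of fields under the FS in effect
  '
}

# Line 1 is split on whitespace, line 3 on commas.
printf 'a b c d\n--\ne,f g,h\n' | demo_fs_switch
# prints 4, then 3
```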

Upvotes: 0

Answers (2)

Freddy

Reputation: 4688

You can use two newlines as the record separator and process each record, which may contain one

autoX
{
  ...
  ...
}

block.

awk '
BEGIN{
  RS="\n\n"                          # set record separator RS to two newlines
  a["good"]; a["moderate"]           # create array a with indices "good" and "moderate"
}                                    
{                                    
  sub(/\n[ \t]+remove[^;]+;/, "")    # remove line containing "remove xxx;"
  for (i in a){                      # loop array indices "good" and "moderate"
    if (index($0, i)){               # if value exists in record
      sub(i";", i";//"i)             # add "//good" to "good;" or "//moderate" to "moderate;"
      match($0, /(auto[0-9]+)/)      # get pos. RSTART and length RLENGTH of "autoX"
      if (RSTART){                   # RSTART > 0 ?
                                     # set prefix including "autox", "//value" and suffix
        $0=substr($0, 1, RSTART+RLENGTH-1) "//"i substr($0, RSTART+RLENGTH)
      }
      break                          # stop looping (we already replaced "autoX")
    }
  }
  printf "%s", (FNR==1 ? "" : RS)$0  # print modified record, prefixed by RS unless it is the first record
}
' Input_file
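
To see how this record splitting behaves on its own, here is a small sketch (count_records is a made-up name). A multi-character RS is an extension supported by gawk and mawk; POSIX leaves the behaviour unspecified, so the sketch assumes one of those implementations.

```shell
# Hypothetical helper: report how many lines each blank-line-separated record holds.
count_records() {
  awk '
    BEGIN { RS = "\n\n" }           # two newlines end a record (gawk/mawk extension)
    { printf "record %d: %d line(s)\n", NR, split($0, lines, "\n") }
  '
}

# Three records: "a\nb", "c" and "d\ne".
printf 'a\nb\n\nc\n\nd\ne' | count_records
# prints:
# record 1: 2 line(s)
# record 2: 1 line(s)
# record 3: 2 line(s)
```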

Upvotes: 1

RavinderSingh13

Reputation: 133518

Here is my attempt to solve this problem:

awk '
FNR==NR{
  if($0~/auto[0-9]+/){
    found1=1
    val=$0
    next
  }
  if(found1 && $0 ~ /{/){
    found2=1
    next
  }
  if(found1 && found2 && $0 ~ /type/){
    sub(/;/,"",$NF)
    a[val]=$NF
    next
  }
  if($0 ~ /}/){
    found1=found2=val=""
  }
  next
}
found3 && /not useful/{
  next
}
/}/{
  found3=val1=""
}
found3 && /type/{
  sub($NF,$NF"//"a[val1])
}
/auto[0-9]+/ && $0 in a{
  print $0"//"a[$0]
  found3=1
  val1=$0
  next
}
1
'  Input_file  Input_file

Explanation: a detailed, line-by-line explanation of the above code.

awk '                                      ##Starting awk program from here.
FNR==NR{                                   ##FNR==NR will be TRUE when first time Input_file is being read.
  if($0~/auto[0-9]+/){                     ##Check condition if a line is having auto string followed by digits then do following.
    found1=1                               ##Setting found1 to 1 which makes sure that the line with auto is FOUND to later logic.
    val=$0                                 ##Storing current line value to variable val here.
    next                                   ##next will skip all further statements from here.
  }
  if(found1 && $0 ~ /{/){                  ##Checking condition if found1 is SET and line has { in it then do following.
    found2=1                               ##Setting found2 value as 1 which tells program further that after auto { is also found now.
    next                                   ##next will skip all further statements from here.
  }
  if(found1 && found2 && $0 ~ /type/){     ##Checking condition if found1 and found2 are SET AND line has type in it then do following.
    sub(/;/,"",$NF)                        ##Substituting semi colon in last field with NULL.
    a[val]=$NF                             ##creating array a with index val and its value is last column of current line.
    next                                   ##next will skip all further statements from here.
  }
  if($0 ~ /}/){                            ##Checking if line has } in it then do following, which basically means previous block is getting closed here.
    found1=found2=val=""                   ##Nullify all variables value found1, found2 and val here.
  }
  next                                     ##next will skip all further statements from here.
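}
found3 && /not useful/{                    ##Statements from here will be executed when 2nd time Input_file is being read, checking if found3 is SET and line has not useful in it.
  next                                     ##next will skip this line, so it is not printed.
}
/}/{                                       ##Checking if line has } here.
  found3=val1=""                           ##Nullifying found3 and val1 variables here.
}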
found3 && /type/{                          ##Checking if found3 is SET and line has type keyword in it then do following.
  sub($NF,$NF"//"a[val1])                  ##Substituting last field value with last field and array a value with index val1 here.
}
/auto[0-9]+/ && $0 in a{                   ##Searching string auto with digits and checking if current line is present in array a then do following.
  print $0"//"a[$0]                        ##Printing current line // and value of array a with index $0.
  found3=1                                 ##Setting found3 value to 1 here.
  val1=$0                                  ##Setting current line value to val1 here.
  next                                     ##next will skip all further statements from here.
}
1                                          ##1 will print all edited/non-edited lines here.
'  Input_file  Input_file                  ##Mentioning Input_file names here.
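
As an alternative to reading Input_file twice, a single-pass sketch can buffer each autoX block and print it once the closing } arrives. This is only an illustration, not part of the code above: the function name annotate_blocks and all variable names are made up, and it assumes that autoX sits alone on its line, that lines to drop start with the word remove, and that the type value is the second field of the type line.

```shell
# Hypothetical single-pass variant: buffer each autoX block, then print it annotated.
annotate_blocks() {
  awk '
    /^[ \t]*auto[0-9]+[ \t]*$/ {            # an autoX header opens a block
      name = $1; inblock = 1; n = 0; type = ""
      next
    }
    inblock {
      if ($1 == "remove") next              # drop "remove ..." lines
      if ($1 == "type") { type = $2; sub(/;$/, "", type) }
      buf[++n] = $0                         # keep every other line of the block
      if ($1 == "}") {                      # block is complete: flush it
        suffix = (type == "" ? "" : "//" type)
        print name suffix
        for (i = 1; i <= n; i++) {
          line = buf[i]
          if (line ~ /^[ \t]*type[ \t]/) line = line suffix
          print line
        }
        inblock = 0
      }
      next
    }
    { print }                               # everything outside autoX blocks is untouched
  ' "$@"
}

annotate_blocks Input_file
```

Because the type is only known after the header line has been read, the block has to be buffered; that is the trade-off against the two-pass approach, which looks the type up in advance.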

Upvotes: 2
