Pawel Bala

Reputation: 93

AWK - how to selectively modify txt file

I would like to print only the 2nd field of a record, when that field matches a regex:

awk '$2 ~ /regex1/'

but ONLY for the records that lie between regex2 and regex3:

awk '/regex2/,/regex3/'

Other records, those not between regex2 and regex3, should be printed normally (all fields).

Any ideas on how to put this together?

A quick sample of input and output:

input

parrot   milana  3 ukraine
dog      husky   1 poland
cat      husky   5 france
elephant malamut 5 belgium
bird     husky   5 turkey

output:

parrot   milana  3 ukraine
dog      husky   1 poland
         husky
elephant malamut 5 belgium
bird     husky   5 turkey

That is:

  1. Show the entire input, but:
  2. Between /dog/ and /elephant/ (these two records are shown unchanged), show only the 2nd field, and only where it matches the regex /husky/.

I hope this is useful...

Upvotes: 3

Views: 194

Answers (3)

Ed Morton

Reputation: 203607

This:

awk '/regex2/,/regex3/'

is shorthand for

awk '/regex2/{f=1} f; /regex3/{f=0}'

The shorthand version IMHO should NEVER be used, as its brevity isn't worth the difficulty it introduces when you try to build on it with other criteria, e.g. not printing the start line and/or not printing the end line and/or introducing other REs to match within the range, as you're doing now.
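To make that concrete, here is a minimal sketch run against the sample records from the question. The space-separated layout and the shell variable names (input, range_out, flag_out, no_end_out) are assumptions made for illustration only. It checks that the two forms print the same lines for this input, then shows how the flag form adapts to skipping the end line, which the range form cannot do without a rewrite.

```shell
# Sample records from the question (space-separated is an assumption here).
input='parrot milana 3 ukraine
dog husky 1 poland
cat husky 5 france
elephant malamut 5 belgium
bird husky 5 turkey'

# Range-pattern shorthand: prints from the /dog/ line through the /elephant/ line.
range_out=$(printf '%s\n' "$input" | awk '/dog/,/elephant/')

# Explicit flag: set on /dog/, print while set, clear after /elephant/.
flag_out=$(printf '%s\n' "$input" | awk '/dog/{f=1} f; /elephant/{f=0}')

# Same idea, but the flag is cleared BEFORE the print test,
# so the /elephant/ end line is no longer printed.
no_end_out=$(printf '%s\n' "$input" | awk '/dog/{f=1} /elephant/{f=0} f')

printf 'range form:\n%s\n\n' "$range_out"
printf 'flag form:\n%s\n\n' "$flag_out"
printf 'flag form, end line dropped:\n%s\n' "$no_end_out"
```

Only the order of the rules changed in the last command; the range form has no equivalent one-token adjustment.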

Given that, you're starting with this script:

awk '/dog/{f=1} f; /elephant/{f=0}'

and you want to print only the lines in that range where you find "husky", so it's the simple, obvious tweak:

awk '/dog/{f=1} f && /husky/; /elephant/{f=0}'
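As a rough illustration of what that tweak prints, here is the same command fed the question's sample records through printf. The space-separated layout and the variable name husky_only are assumptions for this sketch. Note that lines outside the dog-to-elephant range are not printed at all, and the elephant line is dropped because it does not contain "husky".

```shell
# Feed the sample records to the tweaked script and capture what it prints.
husky_only=$(printf '%s\n' \
  'parrot milana 3 ukraine' \
  'dog husky 1 poland' \
  'cat husky 5 france' \
  'elephant malamut 5 belgium' \
  'bird husky 5 turkey' |
  awk '/dog/{f=1} f && /husky/; /elephant/{f=0}')

# Only in-range lines containing "husky" survive:
#   dog husky 1 poland
#   cat husky 5 france
printf '%s\n' "$husky_only"
```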

EDIT: in response to changed requirements, and using a tab-separated file:

$ cat file
parrot  milana  3       ukraine
dog     husky   1       poland
cat     husky   5       france
elephant        malamut 5       belgium
bird    husky   5       turkey

$ awk '
BEGIN{ FS=OFS="\t" }
/elephant/ {f=0}
{
   if (f) {
      if ($2 == "husky") {
         print "", $2
      }
   }
   else {
      print
   }
}
/dog/      {f=1}
' file
parrot  milana  3       ukraine
dog     husky   1       poland
        husky
elephant        malamut 5       belgium
bird    husky   5       turkey

You can write it more briefly:

$ awk '
BEGIN{ FS=OFS="\t" }
/elephant/ {f=0}
f && /husky/ { print "", $2 }
!f
/dog/      {f=1}
' file
parrot  milana  3       ukraine
dog     husky   1       poland
        husky
elephant        malamut 5       belgium
bird    husky   5       turkey

but I think the if-else syntax is clearest and easiest to modify for newcomers to awk. If you want different output formatting, look up "printf" in the manual.
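For example, here is a sketch of the if-else version using printf to reproduce the space-aligned layout from the question instead of tabs. The column widths (8 and 7) are assumptions read off the sample data, the input is taken to be whitespace-separated so the default FS applies, and the shell variable names are only for illustration.

```shell
# Sample records from the question, separated by runs of spaces.
input='parrot   milana  3 ukraine
dog      husky   1 poland
cat      husky   5 france
elephant malamut 5 belgium
bird     husky   5 turkey'

aligned=$(printf '%s\n' "$input" | awk '
/elephant/ { f = 0 }                 # end line: leave the range before printing
{
   if (f) {
      if ($2 == "husky") {
         # pad an empty first column so "husky" stays under column 2
         printf "%-8s %s\n", "", $2
      }
   }
   else {
      # rebuild the record with fixed-width, left-justified columns
      printf "%-8s %-7s %s %s\n", $1, $2, $3, $4
   }
}
/dog/ { f = 1 }                      # start line: enter the range after printing
')

printf '%s\n' "$aligned"
```

With other data the widths would need adjusting, or computing in a first pass over the file.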

Upvotes: 5

nullrevolution

Reputation: 4137

infile:

$ cat input

parrot   milana  3 ukraine
dog      husky   1 poland
cat      husky   5 france
elephant malamut 5 belgium
bird     husky   5 turkey

command:

$ awk '/dog/{m=1} $2 ~ /husky/ && m{print $2} !m{print} /elephant/{m=0}' input

parrot   milana  3 ukraine
husky
husky
bird     husky   5 turkey

Upvotes: 1

sampson-chen

Reputation: 47267

There are some ambiguities in your question, but this should do it:

awk '/regex2/ {inside=1}
     /regex3/ {inside=0}
     $2 ~ /regex1/ && inside {print $2}
     !inside {print}' input_file

Upvotes: 0
