gitmorty

Reputation: 283

Perl one liner to simulate awk script

I'm new to both awk and perl, so please bear with me. I have the following awk script:

awk '/regex1/{p = 0;} /regex2/{p = 1;} p'

What this basically does is print all lines starting from a line that matches regex2, up to but not including the next line that matches regex1.

Example:

 regex1
 regex2
 line 1
 line 2
 regex1
 regex2
 regex1

Output:

 regex2
 line 1
 line 2
 regex2

Is it possible to simulate this using a perl one-liner? I know I can do it with a script saved in a file.

Edit:

A practical example:

24 May 2017 17:00:06,827 [INFO] 123456 (Blah : Blah1) Service-name:: Single line content

24 May 2017 17:00:06,828 [INFO] 567890 (Blah : Blah1) Service-name:: Content( May span multiple lines)

24 May 2017 17:00:06,829 [INFO] 123456 (Blah : Blah2) Service-name: Multiple line content. Printing Object[ ID1=fac-adasd ID2=123231
ID3=123108 Status=Unknown
Code=530007 Dest=CA
]

24 May 2017 17:00:06,830 [INFO] 123456 (Blah : Blah1) Service-name:: Single line content

24 May 2017 17:00:06,831 [INFO] 567890 (Blah : Blah2) Service-name:: Content( May span multiple lines)

Given the search key 123456 I want to extract the following:

24 May 2017 17:00:06,827 [INFO] 123456 (Blah : Blah1) Service-name:: Single line content

24 May 2017 17:00:06,829 [INFO] 123456 (Blah : Blah2) Service-name: Multiple line content. Printing Object[ ID1=fac-adasd ID2=123231
ID3=123108 Status=Unknown
Code=530007 Dest=CA
]

24 May 2017 17:00:06,830 [INFO] 123456 (Blah : Blah1) Service-name:: Single line content

The following awk script does the job:
awk '/[0-9]{2}\s\w+\s[0-9]{4}/{n = 0} /123456/ {n =1}n' file

Upvotes: 0

Views: 833

Answers (2)

melpomene

Reputation: 85767

perl -ne 'print if (/regex2/ .. /regex1/) =~ /^\d+$/'

This is slightly crazy, but here's how it works:

  • -n adds an implicit loop over the input lines
  • the current line is in $_
  • the two bare regex matches (/regex2/, /regex1/) implicitly test against $_
  • we use .. in scalar context, which turns it into a stateful flip-flop operator

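    By that I mean: X .. Y starts out in the "false" state. In the "false" state it only evaluates X. If X returns a false value, it remains in the "false" state (and returns false itself). Once X returns a true value, it moves into the "true" state and returns true. (With two dots, Y is also tested right away against that same line, so a line matching both X and Y starts and ends the range at once; the three-dot form ... defers the Y test to the next line.)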

    In the "true" state it only evaluates Y. If Y returns false, it remains in the "true" state (and returns true itself). Once Y returns a true value, it moves into the "false" state but it still returns true.

  • had we just used print if /regex2/ .. /regex1/, it would have printed all the terminating regex1 lines, too

  • a close reading of Range Operators in perldoc perlop reveals that you can distinguish the end points of the range
  • the "true" value returned by .. is actually a sequence number starting from 1, so the start of a range can be identified by checking for 1
  • when the end of the range is reached (i.e. we're about to move from the "true" state to the "false" state again), the return value gets an "E0" tacked on to the end

    Adding "E0" to an integer doesn't affect its numeric value. Perl implicitly converts strings to numbers when needed, and something like "5E0" is just scientific notation (meaning 5 * 10**0, which is 5 * 1, which is 5).

  • the "false" value returned by .. is the empty string, ""

We check that the result of .. matches the regex /^\d+$/, i.e. is all digits. This excludes the empty string (because we require at least one digit to match), so we don't print lines outside of the range. It also excludes the last line in our range, because E is not a digit.

Upvotes: 3

stark

Reputation: 13189

Not sure if awk prints both the start and end of the range, but Perl does:

perl -ne 'if(/regex2/ ... /regex1/){print}' file

Edit: Awk (at least GNU awk) also has a range pattern, so this could have been done more simply as:

awk '/regex2/,/regex1/' file

Upvotes: 0
