Reputation: 283
I'm new to both awk and perl, so please bear with me.
I have the following awk script:
awk '/regex1/{p = 0;} /regex2/{p = 1;} p'
What this basically does is print all lines starting from a line matching regex2 until a line matching regex1 is found (the regex1 line itself is not printed).
Example:
regex1
regex2
line 1
line 2
regex1
regex2
regex1
Output:
regex2
line 1
line 2
regex2
Is it possible to simulate this using a perl one-liner? I know I can do it with a script saved in a file.
Edit:
A practical example:
24 May 2017 17:00:06,827 [INFO] 123456 (Blah : Blah1) Service-name:: Single line content
24 May 2017 17:00:06,828 [INFO] 567890 (Blah : Blah1) Service-name:: Content( May span multiple lines)
24 May 2017 17:00:06,829 [INFO] 123456 (Blah : Blah2) Service-name: Multiple line content. Printing Object[ ID1=fac-adasd ID2=123231
ID3=123108 Status=Unknown
Code=530007 Dest=CA
]24 May 2017 17:00:06,830 [INFO] 123456 (Blah : Blah1) Service-name:: Single line content
24 May 2017 17:00:06,831 [INFO] 567890 (Blah : Blah2) Service-name:: Content( May span multiple lines)
Given the search key 123456 I want to extract the following:
24 May 2017 17:00:06,827 [INFO] 123456 (Blah : Blah1) Service-name:: Single line content
24 May 2017 17:00:06,829 [INFO] 123456 (Blah : Blah2) Service-name: Multiple line content. Printing Object[ ID1=fac-adasd ID2=123231
ID3=123108 Status=Unknown
Code=530007 Dest=CA
]24 May 2017 17:00:06,830 [INFO] 123456 (Blah : Blah1) Service-name:: Single line content
The following awk script does the job:
awk '/[0-9]{2}\s\w+\s[0-9]{4}/{n = 0} /123456/ {n =1}n' file
Upvotes: 0
Views: 833
Reputation: 85767
perl -ne 'print if (/regex2/ .. /regex1/) =~ /^\d+$/'
This is slightly crazy, but here's how it works:
-n adds an implicit loop over the input lines
the current line is in $_
the two regex matches (/regex2/, /regex1/) implicitly test against $_
we use .. in scalar context, which turns it into a stateful flip-flop operator
By that I mean: X .. Y starts out in the "false" state. In the "false" state it only evaluates X. If X returns a false value, it remains in the "false" state (and returns false itself). Once X returns a true value, it moves into the "true" state and returns true. (With two dots, Y is also tested on that same line, so a line matching both X and Y opens and closes the range at once; three dots, ..., defer the first test of Y to the next line.)
In the "true" state it only evaluates Y. If Y returns false, it remains in the "true" state (and returns true itself). Once Y returns a true value, it moves into the "false" state but it still returns true.
had we just used print if /regex2/ .. /regex1/, it would have printed all the terminating regex1 lines, too
a close reading of perldoc perlop reveals that you can distinguish the end points of the range
the "true" value returned by .. is actually a sequence number starting from 1, so the start of a range can be identified by checking for 1
when the end of the range is reached (i.e. we're about to move from the "true" state to the "false" state again), the return value gets a "E0" tacked on to the end
Adding "E0" to an integer doesn't affect its numeric value. Perl implicitly converts strings to numbers when needed, and something like "5E0" is just scientific notation (meaning 5 * 10**0, which is 5 * 1, which is 5).
the "false" value returned by .. is the empty string, ""
We check that the result of .. matches the regex /^\d+$/, i.e. is all digits. This excludes the empty string (because we require at least one digit to match), so we don't print lines outside of the range. It also excludes the last line in our range, because E is not a digit.
Upvotes: 3
Reputation: 13189
Not sure if awk prints both the start and end of the range, but Perl does:
perl -ne 'if(/regex2/ ... /regex1/){print}' file
Edit: awk (at least GNU awk) also has a range pattern, so this could have been done more simply as:
awk '/regex2/,/regex1/' file
Upvotes: 0