Reputation: 806
Hi, I want to search a file whose contents look similar to this:
Start Cycle
report 1
report 2
report 3
report 4
End Cycle
.... goes on and on..
I want to search for "Start Cycle" and then pull out report 1 and report 3 from it. My regex looks something like this:
(Start Cycle .*\n)(.*\n)(.*\n)(.*\n)
The above regex selects Start Cycle and the next three lines, but I want to omit the third line from my result. Is that possible? Or is there an easier Perl script that would do this? I am expecting a result like:
Start Cycle
report 1
report 3
Upvotes: 1
Views: 404
Reputation: 1851
I took the OP's question as a Perl exercise and came up with the following code. It was just written for learning purposes. Kindly correct me if anything looks suspicious.
while(<>) {
if(/Start Cycle/) {
push @block,$_;
push @block, scalar<> for 1..3;
print @block[0,1,3];
@block=();
}
}
Another version (edited and thanks,@FM):
local $/;
$_ = <>;
@block = (/(Start Cycle\n)(.+\n).+\n(.+\n)/g);
print @block;
Upvotes: 1
Reputation: 342413
while (<>) {
if (/Start Cycle/) {
print $_;
$_ = <>;
print $_;
$_ = <>; $_ = <>;
print $_;
}
}
Upvotes: 0
Reputation: 118128
Update: I did not originally notice that this was just @FM's answer in a slightly more robust and longer form.
#!/usr/bin/perl
use strict; use warnings;
{
local $/ = "End Cycle\n";
while ( my $block = <DATA> ) {
last unless my ($heading) = $block =~ /^(Start Cycle\n)/g;
print $heading, ($block =~ /([^\n]+\n)/g)[1, 3];
}
}
__DATA__
Start Cycle
report 1
report 2
report 3
report 4
End Cycle
Output:
Start Cycle
report 1
report 3
Upvotes: 0
Reputation: 42411
Perhaps a crazy way to do it: alter Perl's understanding of an input record.
$/ = "End Cycle\n";
print( (/(.+\n)/g)[0,1,3] ) while <$file_handle>;
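For anyone who wants to try the record-separator idea without touching a real file, here is a minimal sketch. The in-memory filehandle, the sample data, and the variable names ($data, $fh, @out) are my own assumptions for illustration, not part of the answer above. It also localizes $/ so the change does not leak into other code, and skips any record that is too short to contain three reports.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real file is assumed to repeat this layout.
my $data = <<'END';
Start Cycle
report 1
report 2
report 3
report 4
End Cycle
Start Cycle
report 5
report 6
report 7
report 8
End Cycle
END

# Read from a string as if it were a file (in-memory filehandle).
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @out;
{
    # One "line" is now everything up to and including "End Cycle\n".
    local $/ = "End Cycle\n";
    while (my $record = <$fh>) {
        my @lines = $record =~ /(.+\n)/g;   # split the record into lines
        next if @lines < 4;                 # skip short or partial records
        push @out, @lines[0, 1, 3];         # heading, report 1, report 3
    }
}
close $fh;

print @out;
```

With the sample data this prints the heading plus the first and third report of each cycle.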
Upvotes: 2
Reputation: 239930
If you wanted to leave all of the surrounding code the same but stop capturing the third thing, you could simply remove the parens that cause that line to be captured:
(Start Cycle .*\n)(.*\n).*\n(.*\n)
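As a rough illustration of how that regex behaves over a whole file, here is a small sketch that applies it to a string with /g in list context. The sample text and the names $text and @kept are assumptions made for the example. I also dropped the space after "Start Cycle" because the sample heading has nothing after it; keep the space if your real heading line does.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for the slurped file contents.
my $text = <<'END';
Start Cycle
report 1
report 2
report 3
report 4
End Cycle
Start Cycle
report A
report B
report C
report D
End Cycle
END

# Three capture groups: heading, first report, third report.
# The second report is matched by the bare .*\n and therefore not captured.
# In list context, /g returns the captures of every match in order.
my @kept = $text =~ /(Start Cycle.*\n)(.*\n).*\n(.*\n)/g;

print @kept;
```

For the sample above, @kept holds six lines: the heading, report 1 and report 3 of the first cycle, then the heading, report A and report C of the second.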
Upvotes: 2
Reputation: 53966
The following code prints the odd-numbered lines between Start Cycle and End Cycle:
while (<$filehandle>) {
if (/Start Cycle/ .. /End Cycle/) {
print if /report (\d+)/ and $1 % 2;
}
}
Upvotes: 5
Reputation: 648
The regex populates $1, $2, $3 and $4 with the contents of each pair of brackets.
So if you just look at the contents of $1, $2 and $4 you have what you want.
Alternatively you can just leave off the brackets around the line you do not want captured.
Your regex should look something like
/Start Cycle\n(.+)\n.+\n(.+)\n.+\nEnd Cycle/g
The /g will allow you to evaluate the regex repeatedly and always get the next match every time.
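To make the /g behaviour concrete, here is a minimal sketch of the loop I have in mind. The sample text and the names $text and @pairs are assumptions for the example. In scalar context each evaluation of the match resumes where the previous one ended, so the while loop visits one cycle per iteration.

```perl
use strict;
use warnings;

# Hypothetical sample input with two cycles.
my $text = <<'END';
Start Cycle
report 1
report 2
report 3
report 4
End Cycle
Start Cycle
report A
report B
report C
report D
End Cycle
END

my @pairs;
# Each pass finds the next cycle; $1 is the first report, $2 the third.
while ($text =~ /Start Cycle\n(.+)\n.+\n(.+)\n.+\nEnd Cycle/g) {
    push @pairs, [$1, $2];
    print "Start Cycle\n$1\n$2\n";
}
```

Note that this pattern expects exactly four report lines per cycle; a cycle with a different number of lines would not match as written.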
Upvotes: 1
Reputation: 28723
You can find the text between the start and end markers, then split the content by lines. Here is an example:
my $text = <<TEXT;
Start Cycle
report 1
report 2
report 3
report 4
End Cycle
TEXT
## find text between all start/end pairs
while ($text =~ m/^Start Cycle$(.*?)^End Cycle$/msg) {
my $reports_text = $1;
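## remove leading whitespace (the newline left after "Start Cycle")
$reports_text =~ s/^\s+//;
## split text by newlines
my @report_parts = split(/\r?\n/, $reports_text);
## print the heading plus reports 1 and 3 (indices 0 and 2)
print "Start Cycle\n$report_parts[0]\n$report_parts[2]\n";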
}
Upvotes: 2