Twistar

Reputation: 782

Perl regex, grabbing text between tags

I have a large file that looks something like this:

<Feed stack_overflow>
   sourceid 32456
   prefeed 1
   <LOG>
     level 1
     cache info
  </LOG>
</Feed>

I want to search for anything in this file and retrieve everything inside the enclosing Feed tags, including the tags themselves. So if I search for 32456, I should get everything in the block above.

The code I have now is:

#!/usr/bin/perl
my $input = "<Feed stack_overflow"; #Search string
my $end = "</Feed>"; #End string
open (DATA, "file.config") or die "Error";

my @list = grep /\b$input\b(.*?)\b$end\b/, <DATA>;
chomp @list;
print "$_\n" foreach @list;

But I don't get any results, even though I know that what I am searching for exists. I have successfully managed to print every line containing a specific string with this regex:

my @list = grep /\b$input\b/, <DATA>;

But I need help printing everything between two tags.

Upvotes: 0

Views: 2554

Answers (2)

choroba

Reputation: 241968

Your regular expression is applied to the data line by line (grep tests each line separately), but the text you want spans several lines, so it can never match. You can use the range operator instead:

while (<$DATA>) {
    print if /$input/ .. /$end/;
}

If you want to exclude the border lines, you can change the inner line to

print if (/$input/ .. /$end/) !~ /^1?$|E0/;

DATA is a predefined file handle (it reads the text that follows __DATA__ or __END__ in the script). Consider using a different name, or use a lexical file handle (as $DATA in my example).
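
To make the return value of the range operator visible, here is a minimal sketch. In scalar context the operator returns an empty string outside the range, a sequence number starting at 1 inside it, and appends E0 to the number on the last line, which is what a filter on its return value can test for. The sketch reads from an in-memory string instead of file.config so that it runs on its own; the helper name range_values and the extra "other 1" line are assumptions added for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample modelled on the question, plus one line outside the block
# (the "other 1" line is an assumption added for illustration).
my $text = <<'END_TEXT';
other 1
<Feed stack_overflow>
   sourceid 32456
</Feed>
END_TEXT

my $input = "<Feed stack_overflow";
my $end   = "</Feed>";

# Hypothetical helper: collect the range operator's value for every line.
sub range_values {
    my ($data) = @_;
    # Open the string as an in-memory file so no real file is needed.
    open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";
    my @values;
    while (<$fh>) {
        # Scalar assignment puts .. in scalar (flip-flop) context.
        my $seq = /\Q$input\E/ .. /\Q$end\E/;
        push @values, $seq;
    }
    close $fh;
    return @values;
}

# prints: [],[1],[2],[3E0]
print join(",", map { "[$_]" } range_values($text)), "\n";
```

The empty value for the line outside the block is why a filter that excludes the border lines also has to account for the empty string, not only for 1 and E0.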

Upvotes: 4

Orabîg

Reputation: 12002

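#!/usr/bin/perl
use strict;
use warnings;

my $input = "<Feed stack_overflow"; # Search string
my $end   = "</Feed>";              # End string
open(my $fh, '<', "file.config") or die "Cannot open file.config: $!";

undef $/; # slurp mode: read the whole file in one go
my $content = <$fh>;
close $fh;

# /s lets . match newlines; \Q...\E quotes any regex metacharacters.
# No \b around the tags: both start with "<", so \b would never match there.
my @list = $content =~ m/(\Q$input\E.*?\Q$end\E)/sg;
print "found : $_\n" for @list;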

(several edits due to errors in the original code)
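
The question also asks for the whole Feed block that contains a given search term such as 32456. One way to get there with the slurp approach is sketched below: extract every Feed block first, then keep the ones that contain the term. This is only a sketch; it assumes Feed blocks are never nested, the helper name feeds_containing is hypothetical, and the second block in the sample data is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return every <Feed ...> ... </Feed> block in $content
# that contains $needle. Assumes Feed blocks are not nested.
sub feeds_containing {
    my ($content, $needle) = @_;
    # /s lets . match newlines; /g collects every block in list context.
    my @blocks = $content =~ m{(<Feed\b.*?</Feed>)}sg;
    # \Q...\E makes the search term match literally.
    return grep { /\Q$needle\E/ } @blocks;
}

# Sample modelled on the question; the second block is an assumption.
my $content = <<'END_CONFIG';
<Feed stack_overflow>
   sourceid 32456
   prefeed 1
   <LOG>
     level 1
     cache info
  </LOG>
</Feed>
<Feed other_feed>
   sourceid 99999
</Feed>
END_CONFIG

# Prints only the stack_overflow block, tags included.
print "$_\n" for feeds_containing($content, "32456");
```

For a real file, $content would be filled by slurping file.config as in the code above.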

Upvotes: 0
