Reputation: 73
I have been working on this for a little while now and can't seem to figure it out. I have a file containing a bunch of lines, all structured like the one below: each line starts with "!" and contains three "<DIV>" separators.
!the<DIV>car<DIV>drove down the<DIV>road off into the distance
I am interested in retrieving the last string, "road off into the distance", but I can't seem to get it to work. My current code is below.
while($line = <INFILE>) {
$line =~ /<SEP>{3}(.*)/;
print $1;
}
Any help would be greatly appreciated!
Upvotes: 3
Views: 167
Reputation: 66891
I don't know whether you insist on a regex or simply didn't think of anything else, but split will do this nicely:
$text = (split '<DIV>', $str)[-1];
If you regularly have such repeating patterns, split may well be better suited to the job than a pure regex. (split also takes a full regular expression as its pattern, of course.)
ADDED
All this can be done directly if you only need to pull the last item from each line:
open my $fh, '<', $file or die "Can't open $file: $!";
# chomp drops the newline so it isn't printed twice below
my @text = map { chomp; (split '<DIV>')[-1] } <$fh>;
close $fh;
print "$_\n" for @text;
split by default uses $_, which inside the map is the current element being processed. For lines without a <DIV> this returns the whole line. A filehandle in list context returns all lines as a list; the list context is imposed by map here.
In case you want all text between delimiters you can do
my @rlines = map { [ split '<DIV>' ] } <$fh>;
where [ ] builds an anonymous array from the list returned by split and yields a reference to it, so @rlines contains references to arrays, each holding the text between the <DIV>s on a line. The leading ! is still there, though, and dropping it takes a little more processing.
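A minimal sketch of that extra processing, assuming every line looks like the sample in the question. The @lines array and its contents are made up here to stand in for the lines read from the file:

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the lines read from the file.
my @lines = (
    "!the<DIV>car<DIV>drove down the<DIV>road off into the distance\n",
    "!a<DIV>b<DIV>c<DIV>d\n",
);

# For each line: strip the newline and the leading "!", then split on <DIV>.
my @rlines = map {
    my $line = $_;        # work on a copy so @lines is left untouched
    chomp $line;
    $line =~ s/^!//;      # drop the leading "!" if present
    [ split /<DIV>/, $line ]
} @lines;

# Show the fields of each line, separated by " | ".
print join(' | ', @$_), "\n" for @rlines;
```

With a real file, <$fh> would take the place of @lines.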
Of course, for the map block you can use { (/.*<DIV>(.*)/)[0] } from Jim Garrison's answer for a single match, or modify the regex a little to catch them all.
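One possible way to modify the regex so it catches every field is a global match in list context. This is only a sketch under the assumption that fields never contain the literal text <DIV>; the pattern is my own illustration, not taken from the other answer:

```perl
use strict;
use warnings;

# Sample line from the question.
my $line = '!the<DIV>car<DIV>drove down the<DIV>road off into the distance';

# Each field starts right after the leading "!" or after a <DIV>, and runs
# (non-greedily) up to the next <DIV> or the end of the line.
# With /g in list context, the match returns every captured field.
my @fields = $line =~ /(?:^!|<DIV>)(.*?)(?=<DIV>|\Z)/g;

print "$_\n" for @fields;
```

The lookahead (?=...) checks for the next delimiter without consuming it, so the following match can start at that <DIV>.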
If performance is a factor, that's a slightly different question.
Upvotes: 3
Reputation: 51
Simple regex which answers your question:
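while (my $line = <INFILE>) {
    # The greedy .* runs up to the last <DIV>; the capture is what follows it.
    my ($match) = $line =~ /.*<DIV>(.*)/;
    # Print inside the loop so every line is reported, not just the last one.
    print "$match\n" if defined $match;
}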
Upvotes: 0
Reputation: 6578
A simple substitution could work too:
use feature 'say';
while (<DATA>) {
    chomp;
    # Copy the line, then delete everything up to and including the last <DIV>.
    (my $text = $_) =~ s/.*<DIV>//;
    say $text;
}
Upvotes: 0
Reputation: 86774
The statement
@b = $a =~ /^!(.*?)<DIV>(.*?)<DIV>(.*?)<DIV>(.*)/
will split the string into a list, and you can then extract the 4th element with
$b[3]
If you really want only the last one, do this instead:
($text) = $a =~ /^!.*<DIV>(.*)/
Upvotes: 3