Reputation: 2498
I have a file that looks something like this:
Random words go here
/attribute1
/attribute2
/attribute3="all*the*things*I'm*interested*in*are*inside*here**
and*it*goes*into*the*next*line.*blah*blah*blah*foo*foo*foo*foo*
bar*bar*bar*bar*random*words*go*here*until*the*end*of*the*sente
nce.*I*think*we*have*enough*words"
I want to grep the file for the line starting with /attribute3= and then save the string found inside the quotation marks to a separate variable.
Here's what I have so far:
#!/bin/perl
use warnings; use strict;
my $file = "data.txt";
open(my $fh, '<', $file) or die $!;
while (my $line = <$fh>) {
if ($line =~ /\/attribute3=/g){
print $line . "\n";
}
}
That's printing out:
/attribute3="all*the*things*I'm*interested*in*are*inside*here**
but I want:
all*the*things*I'm*interested*in*are*inside*here**and*it*goes*into*the*next*line.*blah*blah*blah*foo*foo*foo*foo*bar*bar*bar*bar*random*words*go*here*until*the*end*of*the*sentence.*I*think*we*have*enough*words
So what I did next is:
#!/bin/perl
use warnings; use strict;
my $file = "data.txt";
open(my $fh, '<', $file) or die $!;
my $part_I_want;
while (my $line = <$fh>) {
if ($line =~ /\/attribute3=/g){
$line =~ /^/\attribute3=\"(.*?)/; # capture everything after the quotation mark
$part_I_want .= $1; # the capture group; save the stuff on line 1
# keep adding to the string until we reach the closing quotation marks
next (unless $line =~ /\"/){
$part_I_want .= $_;
}
}
}
The code above doesn't work. How do I capture a multiline pattern between two characters (in this case, quotation marks)?
Upvotes: 1
Views: 442
Reputation: 42743
From the command line:
perl -n0e '/\/attribute3="(.*)"/s && print $1' foo.txt
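This is basically what you had, but the -0 flag sets the input record separator $/ to the null character. A text file normally contains no null bytes, so the whole file is read as a single record, which has the same effect as undef $/ within the code (-0777 is the conventional way to ask for slurp mode explicitly). The /s modifier lets . match newlines. From the man page: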
-0[octal/hexadecimal]
specifies the input record separator ($/) as an octal or hexadecimal number. If there are no digits, the null character is the separator.
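To make the effect of the record separator concrete, here is a minimal sketch that reads the same text with three different values of $/. The helper name count_records and the sample text are made up for illustration; an in-memory filehandle stands in for the real file.

```perl
use strict;
use warnings;

# Hypothetical helper: count the records read from $text with a given separator.
sub count_records {
    my ($text, $separator) = @_;
    # Open the string itself as a file so no file on disk is needed
    open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";
    local $/ = $separator;    # undef means "no separator": read everything at once
    my @records = <$fh>;
    close $fh;
    return scalar @records;
}

my $text = "one\ntwo\nthree\n";
print count_records($text, "\n"),  "\n";   # default separator: one record per line
print count_records($text, "\0"),  "\n";   # what -0 sets: no NUL in the text, so one record
print count_records($text, undef), "\n";   # what -0777 sets: the whole input as one record
```

With a single record, a regex with the /s modifier can match across what used to be line boundaries.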
Upvotes: 1
Reputation: 6553
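my $str = do { local($/); <DATA> };    # slurp everything after __DATA__
$str =~ /attribute3="([^"]*)"/ or die "attribute3 not found";
$str = $1;                             # the text between the quotation marks
$str =~ s/\n//g;                       # join the wrapped lines (words are split mid-word)
print "$str\n";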
__DATA__
Random words go here
/attribute1
/attribute2
/attribute3="all*the*things*I'm*interested*in*are*inside*here**
and*it*goes*into*the*next*line.*blah*blah*blah*foo*foo*foo*foo*
bar*bar*bar*bar*random*words*go*here*until*the*end*of*the*sente
nce.*I*think*we*have*enough*words"
Upvotes: 2
Reputation: 2982
Read the entire file into a single variable and use /attribute3=\"([^\"]*)\"/ms
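As a rough sketch of that suggestion, assuming the file fits comfortably in memory: the helper names slurp and extract_attribute3 are hypothetical, and the inline sample is a shortened stand-in for data.txt so the script runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the quoted value of /attribute3, or nothing if absent.
sub extract_attribute3 {
    my ($text) = @_;
    # [^"]* also matches newlines, so neither /m nor /s is strictly required here
    return unless $text =~ m{/attribute3="([^"]*)"};
    my $value = $1;
    $value =~ s/\R//g;    # remove line breaks to join the wrapped lines
    return $value;
}

# Hypothetical helper: read a whole file into one string.
sub slurp {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    local $/;             # undefined $/ makes <$fh> read to end of file
    my $text = <$fh>;
    close $fh;
    return $text;
}

# Shortened stand-in for the real file contents
my $sample = qq{Random words go here\n/attribute1\n/attribute3="foo*\nbar*ba\nz"\n};
print extract_attribute3($sample), "\n";
```

For the file in the question, the call would be my $part_I_want = extract_attribute3(slurp('data.txt'));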
Upvotes: 1