Reputation: 1581
I am 100% new to Perl but do have some PHP knowledge. I'm trying to write a quick script that collects URLs in @urls and saves them to a .txt file. The problem I'm having is that it saves every URL collected so far again each time it runs through the loop, which is super annoying. So when the loop runs, the output looks like this:
url1.com
url1.com url2.com
url1.com url2.com url3.com
What I would like it to look like is just a plain and simple:
url1.com
url2.com
url3.com
Here is my code. If anyone can help, I would appreciate it SO SO much!
#!/usr/bin/perl
use strict;
use warnings;
my $file = "data.rdf.u8";
my @urls;
open(my $fh, "<", $file) or die "Unable to open $file\n";
while (my $line = <$fh>) {
    if ($line =~ m/<(?:ExternalPage about|link r:resource)="([^\"]+)"\/?>/) {
        push @urls, $1;
    }
    open (FH, ">>my_urls.txt") or die "$!";
    print FH "@urls ";
    close(FH);
}
close $fh;
Upvotes: 3
Views: 234
Reputation: 106483
Shouldn't this part:
open (FH, ">>my_urls.txt") or die "$!";
print FH "@urls ";
close(FH);
...be placed outside of the while loop? It makes no sense inside the loop, as @urls is apparently still incomplete there.
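A minimal sketch of that rearrangement: collect inside the loop, write once after it. To keep it self-contained, the input is a made-up sample string standing in for data.rdf.u8 and the output goes to an in-memory string rather than my_urls.txt; both are assumptions for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input standing in for data.rdf.u8.
my $sample = <<'END';
<ExternalPage about="http://url1.com/">
<Title>ignored line</Title>
<link r:resource="http://url2.com/"/>
<ExternalPage about="http://url3.com/">
END

my @urls;
open(my $in, '<', \$sample) or die "Unable to open sample input: $!\n";
while (my $line = <$in>) {
    if ($line =~ m#<(?:ExternalPage about|link r:resource)="([^"]+)"/?>#) {
        push @urls, $1;    # only collect inside the loop
    }
}
close $in;

# Write once, after the loop, one URL per line.
# A real script would open "my_urls.txt" here instead of an in-memory string.
my $output = '';
open(my $out, '>', \$output) or die "Unable to open output: $!\n";
print {$out} "$_\n" for @urls;
close $out;

print $output;
```

Printing "$_\n" for each element, instead of "@urls ", is what puts every URL on its own line.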
And two regex-related side notes: first, with the m operator you may choose another set of delimiters so you don't have to escape the / sign; second, it's not necessary to escape the " sign within a character class definition. In fact, it's not required to escape it in a regex at all, unless you choose this character as the delimiter.
So your regex may look like this:
$line =~ m#<(?:ExternalPage about|link r:resource)="([^"]+)"/?>#
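As a quick check that the two spellings behave the same, here is a small sketch that runs both patterns against one made-up input line (the sample line is an assumption, not taken from the real data file):

```perl
use strict;
use warnings;

# Hypothetical sample line for illustration.
my $line = '<link r:resource="http://url2.com/"/>';

# Original pattern: "/" delimiters, so the slash (and, needlessly, the quote) is escaped.
my ($escaped)   = $line =~ m/<(?:ExternalPage about|link r:resource)="([^\"]+)"\/?>/;

# Same pattern with "#" delimiters: nothing needs escaping.
my ($unescaped) = $line =~ m#<(?:ExternalPage about|link r:resource)="([^"]+)"/?>#;

print "same capture: $escaped\n" if $escaped eq $unescaped;
```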
Upvotes: 4
Reputation: 26871
Do you need the @urls array elsewhere? If not, you could simply print each URL as you find it:
#!/usr/bin/perl
use strict;
use warnings;
my $file = "data.rdf.u8";
open(my $fh, "<", $file) or die "Unable to open $file: $!\n";
# Open the output once, before the loop (">>" appends to any existing file)
open(my $out, ">>", "my_urls.txt") or die "Unable to open my_urls.txt: $!\n";
while (my $line = <$fh>) {
    if ($line =~ m/<(?:ExternalPage about|link r:resource)="([^\"]+)"\/?>/) {
        print $out "$1\n";    # one URL per line
    }
}
close($out);
close $fh;
Upvotes: 2
Reputation: 823
Your print is inside your while loop, so the whole array is written out again on every iteration. It sounds like you want to move your print outside of the loop.
Or, if you want to print each URL as you go through each line, move the declaration "my @urls" down into the loop so that it is reset on every line.
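A rough sketch of the second option, with @urls declared inside the loop so it starts empty on every line. The sample input and the in-memory output string are assumptions to keep the example self-contained; a real script would read data.rdf.u8 and write to my_urls.txt.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for data.rdf.u8.
my $sample = <<'END';
<ExternalPage about="http://url1.com/">
<link r:resource="http://url2.com/"/>
END

my $output = '';
open(my $in,  '<', \$sample) or die "Unable to open sample input: $!\n";
open(my $out, '>', \$output) or die "Unable to open output: $!\n";
while (my $line = <$in>) {
    my @urls;    # fresh, empty array on every iteration
    if ($line =~ m#<(?:ExternalPage about|link r:resource)="([^"]+)"/?>#) {
        push @urls, $1;
    }
    print {$out} "$_\n" for @urls;    # prints only what this line produced
}
close $out;
close $in;

print $output;
```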
Upvotes: 8