Reputation: 41
I have been learning Perl for the past two weeks and writing some Perl scripts for my school project. I need to parse a text file for multiple strings. I searched Perl forums and found some information. The function below searches a text file for one string and returns a result. However, I need the script to search the file for multiple strings.
use strict;
use warnings;

sub find_string {
    my ($file, $string) = @_;
    open my $fh, '<', $file or die "Unable to open $file: $!";
    while (<$fh>) {
        return 1 if /\Q$string/;
    }
    die "Unable to find string: $string";
}

find_string('filename', 'string');
Now, for instance, suppose the file contains multiple strings with special regular expression characters, as listed below:
"testing"
http://www.yahoo.com =1
http://www.google.com=2
I want the function to search for multiple strings, like this:
find_string('filename', 'string1','string2','string3');
Can somebody please explain how I need to do that? It would be really helpful.
Upvotes: 3
Views: 1543
Reputation: 107040
Going through this very quickly:
Right now you pass the name of a file and one string. What if you pass multiple strings instead?
if ( find_string ( $file, @strings ) ) {
    print "Found a string!\n";
}
else {
    print "No string found\n";
}
..
sub find_string {
    my $file    = shift;
    my @strings = @_;
    #
    # Let's make the strings into a regular expression
    #
    my $reg_exp = join "|", @strings;    # Regex is $string1|$string2|$string3...
    open my $fh, "<", $file or die qq(Can't open file...);
    while ( my $line = <$fh> ) {
        chomp $line;
        if ( $line =~ $reg_exp ) {
            return 1;    # Found the string
        }
    }
    return 0;    # String not found
}
I am about to go into a meeting, so I haven't really even tested this, but the idea is there. A few things:
- Escape any special regular expression characters in your strings with the quotemeta command, or use \Q and \E before and after each string.
- Consider use autodie to handle files that can't be opened. Then you don't have to check your open statement (like I did above).
- Use a scalar file handle ($fh). Instead of opening your file in the subroutine, I would pass in a scalar file handle. This would allow you to take care of an invalid file issue in your main program. That's the big advantage of scalar file handles: they can be easily passed to subroutines and stored in class objects.
#! /usr/bin/env perl
#
use strict;
use warnings;
use autodie;
use feature qw(say);

use constant {
    INPUT_FILE => 'test.txt',
};

open my $fh, "<", INPUT_FILE;
my @strings = qw(foo fo+*o bar fubar);
if ( find_string ( $fh, @strings ) ) {
    print "Found a string!\n";
}
else {
    print "No string found\n";
}

sub find_string {
    my $fh      = shift;    # The file handle
    my @strings = @_;       # A list of strings to look for
    #
    # We need to go through each string to make sure there's
    # no special re characters
    for my $string ( @strings ) {
        $string = quotemeta $string;
    }
    #
    # Let's join the strings into one big regular expression
    #
    my $reg_exp = join '|', @strings;    # Regex is $string1|$string2|$string3...
    $reg_exp = qr($reg_exp);             # This is now a regular expression
    while ( my $line = <$fh> ) {
        chomp $line;
        if ( $line =~ $reg_exp ) {
            return 1;    # Found the string
        }
    }
    return 0;    # String not found
}
- use autodie handles issues when I can't open a file. No need to check for it.
- I use the three-argument form of open. This is the preferred way.
- I use a scalar file handle, $fh, which allows me to pass it to my find_string subroutine. I open the file in the main program, so I can handle read errors there.
- I loop through @strings and use the quotemeta command to automatically escape special regular expression characters.
- Because I change $string in my loop, it actually modifies the @strings array.
- I use qr to create a regular expression.
- The resulting regular expression is /foo|fo\+\*o|bar|fubar/.
- Note that fooburberry will match with foo. Do you want that, or do you want your strings to be whole words?
Upvotes: 2
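On the whole-word question raised in the answer above, here is a minimal sketch of one possible approach. The helper name whole_word_regex is hypothetical, and "whole word" is assumed to mean "not directly preceded or followed by a word character", which is only one reasonable definition. Lookarounds are used instead of \b so that strings beginning or ending with punctuation still behave sensibly.

```perl
use strict;
use warnings;

# Hypothetical helper: build one regex that matches any of the given
# literal strings, but only when not embedded in a longer word.
sub whole_word_regex {
    my @strings = @_;

    # Escape regex metacharacters, then join into one alternation.
    my $alternation = join '|', map { quotemeta($_) } @strings;

    # (?<!\w) and (?!\w): no word character directly before or after.
    return qr/(?<!\w)(?:$alternation)(?!\w)/;
}

my $re = whole_word_regex(qw(foo fo+*o bar fubar));

for my $line ( 'fooburberry', 'a foo here', 'rebar', 'fo+*o!' ) {
    my $verdict = $line =~ $re ? 'match' : 'no match';
    print "$line: $verdict\n";
}
```

With this pattern, fooburberry and rebar no longer match, while "a foo here" and "fo+*o!" still do. If substring matches are what you want, the plain alternation without the lookarounds is the simpler choice.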
Reputation: 762
I think you can read the file content into an array first, then grep that array for each string.
use strict;
use warnings;

sub find_multi_string {
    my ($file, @strings) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    # store the whole file in an array
    my @array = <$fh>;
    for my $string (@strings) {
        # \Q...\E makes the string match literally
        if (grep /\Q$string\E/, @array) {
            next;
        } else {
            die "Cannot find $string in $file";
        }
    }
    return 1;
}
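A possible variation on the slurp-and-grep idea above, offered as a sketch rather than a tested drop-in: instead of dying at the first string that is absent, return the list of strings that were not found. The name missing_strings is hypothetical, and the temporary file and the bing.com string exist only to make the example self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the strings that do NOT occur in the file.
sub missing_strings {
    my ($file, @strings) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my @lines = <$fh>;    # slurp the whole file into memory
    close $fh;

    # Keep each string for which no line contains it literally.
    return grep {
        my $string = $_;
        !grep { /\Q$string\E/ } @lines;
    } @strings;
}

# Demo data modelled on the sample lines in the question.
my ($tmp_fh, $tmp_name) = tempfile( UNLINK => 1 );
print {$tmp_fh} qq("testing"\n), "http://www.yahoo.com =1\n", "http://www.google.com=2\n";
close $tmp_fh;

my @missing = missing_strings( $tmp_name,
    '"testing"', 'http://www.google.com=2', 'http://www.bing.com' );
print @missing ? "Missing: @missing\n" : "All strings found\n";
```

Slurping is fine for small files. For very large files, a line-by-line loop avoids holding everything in memory.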
Upvotes: 0
Reputation: 5139
I'm happy to see use strict and use warnings in your script. Here is one basic way to do it.
use strict;
use warnings;

sub find_string {
    my ($file, $string1, $string2, $string3) = @_;
    my $found1 = 0;
    my $found2 = 0;
    my $found3 = 0;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    while (<$fh>) {
        # \Q...\E makes each string match literally
        if ( /\Q$string1\E/ ) {
            $found1 = 1;
        }
        if ( /\Q$string2\E/ ) {
            $found2 = 1;
        }
        if ( /\Q$string3\E/ ) {
            $found3 = 1;
        }
    }
    if ( $found1 == 1 and $found2 == 1 and $found3 == 1 ) {
        return 1;
    } else {
        return 0;
    }
}

my $result = find_string('filename', 'string1', 'string2', 'string3');
if ( $result == 1 ) {
    print "Found all three strings\n";
} else {
    print "Didn't find all three\n";
}
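The same "found all of them" logic can probably be generalised to any number of strings by replacing the three flag variables with a hash. The sketch below assumes that plain substring matching is wanted. The name find_all_strings is hypothetical, and an in-memory file handle stands in for a real file so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical generalisation: true only if every string occurs
# somewhere in the lines read from the file handle.
sub find_all_strings {
    my ($fh, @strings) = @_;
    my %pending = map { $_ => 1 } @strings;    # strings not yet seen

    while ( my $line = <$fh> ) {
        for my $string ( keys %pending ) {
            # index() does a literal substring search, so no escaping is needed
            delete $pending{$string} if index( $line, $string ) >= 0;
        }
        last unless %pending;    # everything found, stop reading early
    }
    return %pending ? 0 : 1;
}

# Demo data modelled on the sample lines in the question.
my $text = "\"testing\"\nhttp://www.yahoo.com =1\nhttp://www.google.com=2\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my $found = find_all_strings( $fh, 'testing', 'yahoo.com', 'google.com=2' );
print $found ? "Found all strings\n" : "Didn't find all strings\n";
```

Because the file handle is passed in, the caller decides how to open the file and how to handle a failure, as suggested in the first answer.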
Upvotes: 0