
Reputation: 1032

Perl regex: How do you match multiple words in Perl?

I'm writing a small script that is supposed to match all strings within another file (the words between "" and '', including the "" and '' symbols themselves).

Below is the regex statement I am currently using; however, it only produces results for '(.*)' and not for "(.*)":

my @string_matches = ($file_string =~ /'(.*)' | "(.*)"/g);

print "\n@string_matches";

Also, how can I include the "" or '' symbols in the results (that is, print "string" instead of just string)? I've tried searching online but couldn't find any material on this.

$file_string is basically a string version of an entire file.

Upvotes: 0

Views: 5914

Answers (3)

Lee Duhem

Reputation: 15121

You can use '[^']*' to match a single-quoted string and "[^"]*" to match a double-quoted one.

If you want to support other features, such as escape sequences, consider using the Text::ParseWords or Text::Balanced modules.
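The two character-class patterns stop at the first quote they see, even a backslash-escaped one. As a rough illustration of what supporting escapes involves, here is a minimal regex-only sketch (it is an alternative to the modules named above, not how they work internally; the sample text and variable names are made up for the example):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input; the backslash-escaped quotes are the interesting part.
my $text = q{say "he said \"hi\"" and 'it\'s fine' then "plain"};

# Inside each kind of quote, accept either a backslash followed by any
# character, or any character that is neither a backslash nor the closing quote.
my $quoted = qr/"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*'/s;

my @matches = $text =~ /($quoted)/g;
print "$_\n" for @matches;
```

This should print the three quoted strings with their escaped quotes intact. For anything more involved (nested delimiters, shell-style word splitting), the modules are probably the better choice.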

Note:

  1. Because * is greedy, '.*' matches everything between the first and the last single quote on a line. If the line contains more than one single-quoted substring, you get one long match instead of several.

  2. You can use ('[^']*') instead of '([^']*)' to capture the single quotes along with the substring between them; the same applies to double quotes.

  3. Because '[^']*' and "[^"]*" can never both match at the same position, m/('[^']*')|("[^"]*")/ with /g returns an undef for the unmatched group of every match in list context. Using a single group, m/('[^']*'|"[^"]*")/g, avoids this.
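Note 3 can be seen in isolation with a short sketch (the sample string and variable names are invented for the illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $s = q{a 'one' b "two" c};

# Two capture groups: every match contributes a pair of values, and the
# group that did not take part in that match is undef.
my @two_groups = $s =~ /('[^']*')|("[^"]*")/g;

# One capture group around the whole alternation: one defined value per match.
my @one_group = $s =~ /('[^']*'|"[^"]*")/g;

print scalar(@two_groups), " values with two groups\n";
print scalar(@one_group),  " values with one group\n";
print join(', ', map { defined $_ ? $_ : 'undef' } @two_groups), "\n";
```

With two groups the list has four entries, two of them undef; with one group it has just the two quoted strings.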

Here is a test program:

#!/usr/bin/perl

use strict;
use warnings;

# Sample input containing both single- and double-quoted substrings.
my $file_string = q{Test "test in double quotes" test 'test in single quotes' and "test in double quotes again" test};
my @string_matches = ($file_string =~ /('[^']*'|"[^"]*")/g);

# Separate the interpolated array elements with newlines instead of spaces.
local $" = "\n";
print "@string_matches\n";

Testing:

$ perl t.pl 
"test in double quotes"
'test in single quotes'
"test in double quotes again"

Upvotes: 0

Pedro Lobito

Reputation: 98881

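#!/usr/local/bin/perl
use strict;
use warnings;

# Read the whole file into $string.
open my $fh, '<', "strings.txt" or die "Cannot open strings.txt: $!";
read $fh, my $string, -s $fh;
close $fh;

# \1 requires the closing quote to be the same character as the opening one.
while ($string =~ m/(['"]).*?\1/g) {
    print "$&\n";
}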

Upvotes: 0

aelor

Reputation: 11116

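Use this: '(.*?)'|"(.*?)"

I suspect the greedy quantifier is extending your match up to the last '; making it lazy fixes that. Note that the spaces around | have to go, because without the /x modifier they are matched literally.

Alternatively, I would use this regex:

['"][^'"]*?['"]

This also solves your problem of the quotes not being included in the match, although it does not check that the closing quote is the same character as the opening one.

Demo: http://regex101.com/r/dI6gD7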
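To make the greedy-versus-lazy point concrete, here is a small sketch (the sample strings and variable names are assumptions for the example). It also checks what happens when the pattern from the question keeps its spaces around |:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = q{x = 'first' + 'second';};

my ($greedy) = $line =~ /('.*')/;     # runs to the last quote on the line
my ($lazy)   = $line =~ /('.*?')/;    # stops at the first closing quote

print "greedy: $greedy\n";
print "lazy:   $lazy\n";

# Without /x the spaces around | are part of the pattern, so each
# alternative also needs a literal space next to the quoted string.
my @spaced = q{'a',"b"} =~ /'.*?' | ".*?"/g;
my @tight  = q{'a',"b"} =~ /'.*?'|".*?"/g;

print scalar(@spaced), " matches with spaces in the pattern\n";
print scalar(@tight),  " matches without them\n";
```

The greedy version returns 'first' + 'second' as one match, the lazy version returns only 'first', and the spaced pattern finds nothing in an input that has no spaces around its quotes.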

Upvotes: 1
