Reputation: 10790
I have a text file which contains names enclosed in single quotes. How do I write a regex to get all the names the text contains?
- "Lady of Spain" (uncredited)
Music by 'Tolchard Evans' (qv)
Lyrics by 'Robert Hargreaves (II)' (qv), 'Stanley Damerell' (qv) and 'Henry B. Tilsley' (qv)
Performed by 'Jack Haig' (qv) and 'Kenneth Connor' (qv)
Here is what I could come up with.
/(\'(.*)\')*/
However, the period matches only up to the newline, so I modified the regex to include newlines:
/(\'(.*)\'.*(\n|\r\n)*)*/
But it's still not working. Please help me figure out why my regex isn't working.
Upvotes: 1
Views: 143
Reputation: 45662
I'd use split instead:
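#!/usr/bin/env perl
use strict;
use warnings;

while (<DATA>) {
    chomp;
    # Because the pattern has a capturing group, split also returns the
    # delimiters, i.e. the quoted names themselves.
    my @values = split /('.*?')/;
    foreach my $val (@values) {
        print "$val\n" if $val =~ m/^'/;
    }
}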
__DATA__
- "Lady of Spain" (uncredited)
Music by 'Tolchard Evans' (qv)
Lyrics by 'Robert Hargreaves (II)' (qv), 'Stanley Damerell' (qv) and 'Henry B. Tilsley' (qv)
Performed by 'Jack Haig' (qv) and 'Kenneth Connor' (qv)
outputs:
'Tolchard Evans'
'Robert Hargreaves (II)'
'Stanley Damerell'
'Henry B. Tilsley'
'Jack Haig'
'Kenneth Connor'
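To see why the filter on a leading quote is needed, here is a small sketch of what split returns for one of the sample lines. The variable names are only for illustration.

```perl
use strict;
use warnings;

# One line taken from the sample data in the question.
my $line = "Performed by 'Jack Haig' (qv) and 'Kenneth Connor' (qv)";

# The capturing group makes split keep the delimiters in the result list,
# so the quoted names end up interleaved with the text between them.
my @parts = split /('.*?')/, $line;

# Print each element with its index to show the interleaving.
printf "%d: [%s]\n", $_, $parts[$_] for 0 .. $#parts;
```

The quoted names land at the odd indexes and the surrounding text at the even ones, which is why the loop above keeps only the elements that start with a single quote.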
Upvotes: 3
Reputation: 67900
You do not need to match newline with those lines of input. I think your problem lies not so much with the regex, as with how you process your data. As long as your single quoted strings do not contain a newline, you do not need to compensate for that.
Try this one-liner, for example:
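perl -nwE '$,="\n"; say /\x27([^\x27]+)\x27/g;' quotes.txt
(A single quote cannot be escaped inside a single-quoted shell string, so the pattern uses \x27, the hex escape for a single quote, instead.)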
As you can see, I use the global option /g to get all the matches from each line.
Further explanations:
-n: assume a while (<>) loop around the program (to get input from the file)
-E: like -e, but enables optional features (such as say)
$,: set the OUTPUT_FIELD_SEPARATOR to newline, so that all matches are separated by newline.
If you have the whole text file in a string, try this:
my @matches = $string =~ /'([^']+)'/g;
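A minimal sketch of the whole-string approach, assuming the file contents are already in $string (here a hard-coded sample taken from the question). It also contrasts the greedy '(.*)' from the question with the negated character class, which is probably where the original attempt goes wrong on lines that hold several names.

```perl
use strict;
use warnings;

# Sample input copied from the question; in practice $string would hold
# the contents of the file.
my $string = <<'END';
- "Lady of Spain" (uncredited)
Music by 'Tolchard Evans' (qv)
Performed by 'Jack Haig' (qv) and 'Kenneth Connor' (qv)
END

# Greedy: .* runs to the last quote on the line, so two names on one line
# are returned as a single match with the text between them included.
my @greedy = $string =~ /'(.*)'/g;

# Negated class: each match stops at the first closing quote.
my @matches = $string =~ /'([^']+)'/g;

print "$_\n" for @matches;
```

In list context, a /g match with one capturing group returns just the captured text, so @matches holds the names without their quotes.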
Upvotes: 1
Reputation: 1824
You can use this:
use strict;
use warnings;

open my $fh, '<', 'myfile' or die "Couldn't open file: $!";
# read the whole file into a string
my $string = '';
while (<$fh>) {
    $string .= $_;
}
close $fh;
# match the regex and record the order in which each name first appears
my (%hash, $i);
while ($string =~ m/'(.*?)'/g) {
    $hash{$1} = ++$i unless $hash{$1};
}
# sort the unique names by order of first appearance
my @array = sort { $hash{$a} <=> $hash{$b} } keys %hash;
# print the array
foreach (@array) {
    print $_ . "\n";
}
Upvotes: 0