Reputation: 49
In Perl, I would like to use a negated character class to replace everything except a pattern with nothing, so that only the expected string remains. I expected this approach to work, but in my case it doesn't:
$var =~ s/[^PATTERN]//g;
The original string:
$string = '<iframe src="https://foo.bar/embed/b74ed855-63c9-4795-b5d5-c79dd413d613?autoplay=1&context=cGF0aD0yMSwx</iframe>';
The substring I want to keep: b74ed855-63c9-4795-b5d5-c79dd413d613
(5 groups of hex digits separated by 4 dashes)
My code:
$pattern2keep = "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}";
(This should match only xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx: 5 groups of hex digits separated by 4 dashes, with group lengths 8-4-4-4-12.)
The following should replace everything except the pattern with nothing, but it does not:
$string =~ s/[^$pattern2keep]//g;
What am I doing wrong please? Thanks.
Upvotes: 1
Views: 1502
Reputation: 126722
A character class matches a single character equal to any one of the characters in the class. If the class begins with a caret, the class is negated, so it matches any one character that isn't in the class.
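If $pattern2keep is [0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}, then [^$pattern2keep] does not mean "anything except this pattern". Brackets only ever describe a set of single characters: at best the class would exclude the individual characters -, 0 to 9, a to f, [, ], {, and }. In fact the class ends at the first ], so the regex is [^[0-9a-f] (one character that is not [, a digit, or a to f) followed by {8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}] as ordinary pattern text ending in a literal ]. Your string contains no ], so nothing matches and the substitution changes nothing.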
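To make the per-character behaviour concrete, here is a minimal sketch. The short sample string and the names $sample, $filtered, $before and $count are made up for illustration; the second half reuses the string and pattern from the question. It first shows that a negated class keeps every allowed character wherever it occurs, then that the question's substitution leaves the string untouched, because after interpolation the class closes at the first ] and the rest is ordinary pattern text.

```perl
use strict;
use warnings;

# Made-up sample input, for illustration only.
my $sample = 'id=ab12-zz';

# A negated class is tested against one character at a time, so this
# deletes every character that is not a hex digit or a dash.
(my $filtered = $sample) =~ s/[^0-9a-f-]//g;
print "$filtered\n";    # dab12- (the "d" of "id" survives too)

# The substitution from the question, applied to the question's string.
my $string = '<iframe src="https://foo.bar/embed/b74ed855-63c9-4795-b5d5-c79dd413d613?autoplay=1&context=cGF0aD0yMSwx</iframe>';
my $pattern2keep = "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}";

my $before = $string;

# After interpolation the class closes at the first "]", and the regex
# ends in a literal "]" that the string does not contain.
my $count = ($string =~ s/[^$pattern2keep]//g) || 0;
print "replacements: $count\n";                            # replacements: 0
print $string eq $before ? "unchanged\n" : "changed\n";    # unchanged
```

Either way, a character class cannot express "everything except this substring", which is why capturing is the better tool here.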
You need to capture the substring instead, like this:
use strict;
use warnings 'all';
use feature 'say';
my $string = '<iframe src="https://foo.bar/embed/b74ed855-63c9-4795-b5d5-c79dd413d613?autoplay=1&context=cGF0aD0yMSwx</iframe>';
my $pattern_to_keep = qr/ \p{hex}{8} (?: - \p{hex}{4} ){3} - \p{hex}{12} /x;
my $kept;
$kept = $1 if $string =~ /($pattern_to_keep)/;
say $kept // 'undef';
b74ed855-63c9-4795-b5d5-c79dd413d613
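If you prefer something closer to the substitution you were attempting, two common variants are sketched below. This is only an illustration, not the one right way: it reuses the pattern from the question (compiled with qr//), and the variable names $kept and $stripped are arbitrary. The first variant relies on a match in list context returning its captures; the second replaces the entire string with the captured part.

```perl
use strict;
use warnings;

my $string = '<iframe src="https://foo.bar/embed/b74ed855-63c9-4795-b5d5-c79dd413d613?autoplay=1&context=cGF0aD0yMSwx</iframe>';

# Same pattern as in the question, precompiled with qr//.
my $pattern2keep = qr/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/;

# Variant 1: a match in list context returns the captured groups,
# so $kept stays undefined when nothing matches.
my ($kept) = $string =~ /($pattern2keep)/;
print defined $kept ? "$kept\n" : "no match\n";

# Variant 2: match the whole string and replace it with the capture.
# The copy keeps $string intact; if the pattern does not occur,
# $stripped is simply left equal to the original string.
(my $stripped = $string) =~ s/\A.*?($pattern2keep).*\z/$1/s;
print "$stripped\n";
```

Variant 1 makes a failed match easy to detect, whereas variant 2 silently returns the input unchanged, so check the result if the pattern might be absent.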
Upvotes: 8