Reputation: 21
I tried to write a regex that matches a string ending with the word 'Remixes', but only when that word is not preceded by certain words and characters. I came up with the following two regexes. They give different results, but neither matches perfectly:
^(\w+)((?!\&|\+|And|The|Of|Various|House|Unreleased|Selected).)\s(Remixes)$
This excludes all the keywords in the string but not when it contains multiple words like: Think Twice Remixes or when it has one preceding word like: Various Remixes
^(.*)((?!\&|\+|And|The|Of|Various|House|Unreleased|Selected).)\s(Remixes)$
This excludes the following test example: Fill Me Up + Remixes but not other examples with the excluded keywords, like Sides & Remixes
How can I make the first regex match strings with multiple preceding words, and not match when an excluded word is the first and only preceding word?
Upvotes: 2
Views: 620
Reputation: 53508
Honestly, I wouldn't. regex is a powerful tool, and you can do a lot of things with it, but your code becomes much simpler and clearer when you don't try to "single-regex" every problem.
For your example, I would be quite tempted to use Perl's grep function, which lets you specify compound conditions:
my @filtered = grep {
    m/Remixes$/
      and not m/(And
                |The
                |Of
                |Various
                |House
                |Unreleased
                |Selected
                )\s*.?\s+Remixes/xi
} @list_of_things;
E.g.:
#!/usr/bin/env perl
use strict;
use warnings;
#set up a list of words to exclude when they directly precede "Remixes"
#qw is perl's "quote words" and lets you specify whitespace delimited values.
my @exclude_remix_prefix = qw ( And
The
Of
Various
House
Unreleased
Selected );
#turn that into a sub regex (qr 'compiles' a regex).
my $exclude = join( "|", @exclude_remix_prefix );
$exclude = qr/($exclude)\s+Remixes/i;
#read from the <DATA> filehandle,
#but you could use <> to read from STDIN/filenames like 'sed/grep' do.
my @filtered = grep { m/Remixes$/i and not m/$exclude/i; } <DATA>;
print @filtered;
__DATA__
Fill Me Up + Remixes
Sides & Remixes
Something Selected remixes
Output:
Fill Me Up + Remixes
Sides & Remixes
(Give me some samples of what should/shouldn't be matched, and I will expand)
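If the & and + from the question's pattern should be excluded as well, one possible sketch is to pass every term through quotemeta before joining, so the symbols are matched literally. The (?<!\S) lookbehind and the sample titles are my own assumptions (an excluded term only counts as a whole word directly before "Remixes"); they are not part of the code above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Exclusion terms from the question, including the two symbols.
my @exclude_terms = ( '&', '+', qw( And The Of Various House Unreleased Selected ) );

# quotemeta escapes regex metacharacters, so '+' is matched literally.
my $alternation = join '|', map { quotemeta($_) } @exclude_terms;

# (?<!\S) requires start-of-string or whitespace before the term.
# Assumption: terms count only as whole words, so "Warehouse Remixes" is kept.
my $exclude = qr/(?<!\S)(?:$alternation)\s+Remixes$/i;

# Hypothetical sample titles, based on the question and the examples above.
my @titles = (
    'Fill Me Up + Remixes',
    'Sides & Remixes',
    'Various Remixes',
    'Think Twice Remixes',
    'Warehouse Remixes',
);

# Keep titles that end in "Remixes" but are not preceded by an excluded term.
my @filtered = grep { m/Remixes$/i and not m/$exclude/ } @titles;
print "$_\n" for @filtered;
```

With these samples, only Think Twice Remixes and Warehouse Remixes should be printed.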
We're probably straying a bit from your original use case, but if you want to create a transform pattern:
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
my @exclude_remix_prefix = qw ( And
The
Of
Various
House
Unreleased
Selected );
my $exclude = join( "|", @exclude_remix_prefix );
$exclude = qr/($exclude)\s+Remixes/i;
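#parentheses keep the key/value pair together: ?: binds tighter than =>,
#so without them an excluded line would still contribute a stray element.
my %transform = map { m/$exclude/ ? () : ( m/(.*)/ => m/(.*)\s+Remixes/ ) } <DATA>;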
print Dumper \%transform;
__DATA__
Euterpeh Remixes
The Beauty And The Beast Remixes
Think Twice Remixes
Stop And Reset Remixes
This generates a hash containing:
$VAR1 = {
'The Beauty And The Beast Remixes' => 'The Beauty And The Beast',
'Think Twice Remixes' => 'Think Twice',
'Euterpeh Remixes' => 'Euterpeh',
'Stop And Reset Remixes' => 'Stop And Reset'
};
Which you could perhaps use to generate a sequence of rename operations?
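As a rough sketch of that idea, the hash can be walked to list the planned operations. The hash literal below is a hypothetical stand-in for %transform, and this is only a dry run: nothing is renamed. Swapping the print for Perl's built-in rename would do the actual work, assuming the keys are real file or directory names.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical stand-in for the %transform hash (original name => new name).
my %transform = (
    'Think Twice Remixes' => 'Think Twice',
    'Euterpeh Remixes'    => 'Euterpeh',
);

# Build one planned operation per entry; sort the keys for a stable order.
my @plan;
for my $old ( sort keys %transform ) {
    push @plan, sprintf 'rename "%s" -> "%s"', $old, $transform{$old};
}

# Dry run: print the plan instead of touching the filesystem.
print "$_\n" for @plan;
```

Quoting the names in the output keeps titles that contain spaces readable as single arguments.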
Or if you just want to do some operation 'in place', then a for loop:
for ( <DATA> ) {
    chomp;
    next if m/$exclude/;
    print "rename ", m/(.*)\s+Remixes/, " ", m/(.*)/, "\n";
}
(OK, I know 'rename' isn't quite what you want to do, but ...)
Upvotes: 1