Reputation: 3580
I have the following Pig Latin filter:
filtered = FILTER raw BY year >= 1960 AND string MATCHES '(?!.*[0-9].*|.{1}|.*@.*|.*www.*|.*http.*)';
I intended to get the following results for these strings (the rule expected to reject each string is shown after the result):
a #false .{1}
[email protected] #false .*@.*
http://somesite.com #false .*http.*
www.somesite.com #false .*www.*
12word #false .*[0-9].*
wo12rd #false .*[0-9].*
word12 #false .*[0-9].*
red #true
Instead, I get an empty result set.
EDIT: I've updated the regex to:
'^(?!.*[0-9].*|.{1}|.*@.*|.*www.*|.*http.*)$'
after m.buettner's correction, but continue to get an empty result set.
Upvotes: 1
Views: 5083
Reputation: 44259
There are two problems. Firstly, it seems that Pig Latin requires the pattern to match the full string rather than just a match somewhere within the string. But your negative lookahead does not consume any characters, so it cannot match the full string. This can be resolved by appending .* to the pattern. Secondly, your rule .{1} (where {1} is redundant) does not require this one character to be the only character in the string. So in your last example, it will simply consume the r of red and set off the negative lookahead, which makes the match fail.
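Both problems can be reproduced outside Pig. The following is a minimal sketch in Python, assuming that re.fullmatch behaves like Pig's full-string MATCHES for this pattern (an assumption, not something verified against Pig itself). The names ORIGINAL, WITH_TAIL and full_match are made up for illustration.

```python
import re

# The pattern from the question, unchanged.
ORIGINAL = r"(?!.*[0-9].*|.{1}|.*@.*|.*www.*|.*http.*)"
# The same pattern with .* appended so it can consume the whole string.
WITH_TAIL = ORIGINAL + ".*"


def full_match(pattern, text):
    """Return True only if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None


# Problem 1: a bare lookahead is zero-width, so it can only fully match "".
original_on_red = full_match(ORIGINAL, "red")
original_on_empty = full_match(ORIGINAL, "")

# Problem 2: even with .* appended, .{1} matches the leading "r" of "red",
# so the negative lookahead fails and "red" is still rejected.
with_tail_on_red = full_match(WITH_TAIL, "red")

print(original_on_red, original_on_empty, with_tail_on_red)
```

Under these assumptions, the only string the original pattern accepts is the empty string, which would explain the empty result set.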
Thus, here is the solution:
(?!.*[0-9]|.$|.*@|.*www|.*http).*
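As a quick check, the pattern can be run against the strings from the question. This is a sketch in Python rather than Pig, assuming re.fullmatch stands in for Pig's full-string MATCHES. The address user@example.com is a placeholder chosen for illustration, and the names SOLUTION, EXPECTED and keeps are hypothetical.

```python
import re

# The corrected pattern from the answer.
SOLUTION = re.compile(r"(?!.*[0-9]|.$|.*@|.*www|.*http).*")

# Expected outcomes from the question; the e-mail address is a placeholder.
EXPECTED = {
    "a": False,                    # single character, rejected by .$
    "user@example.com": False,     # contains @
    "http://somesite.com": False,  # contains http
    "www.somesite.com": False,     # contains www
    "12word": False,               # contains a digit
    "wo12rd": False,
    "word12": False,
    "red": True,
}


def keeps(text):
    """Return True if the whole string matches, i.e. the row would be kept."""
    return SOLUTION.fullmatch(text) is not None


results = {text: keeps(text) for text in EXPECTED}

for text, kept in results.items():
    print(f"{text!r}: {kept}")
```

If Pig's regex engine agrees with Python on this pattern, only red passes the filter.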
Upvotes: 1