Reputation: 45
I was experimenting with regex as part of my learning process.
Input is: I am ironman and I was batman and I will be superman
I want to match all words except the word batman.
I tried [^(batman)]+ but it refuses to match the characters a, b, m, n, t anywhere in the string.
How can I achieve this?
Upvotes: 4
Views: 68
Reputation: 30985
You can use the discard technique. For instance, you can use this pattern:
batman|(\w+)
Then you will have the words stored in the capturing group.
In the match information below, every word other than batman is captured in group 1, while batman is matched but not captured, so it is discarded.
Match information:
MATCH 1
1. [0-1] `I`
MATCH 2
1. [2-4] `am`
MATCH 3
1. [5-12] `ironman`
MATCH 4
1. [13-16] `and`
MATCH 5
1. [17-18] `I`
MATCH 6
1. [19-22] `was`
MATCH 7
1. [30-33] `and`
MATCH 8
1. [34-35] `I`
MATCH 9
1. [36-40] `will`
MATCH 10
1. [41-43] `be`
MATCH 11
1. [44-52] `superman`
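As a rough sketch of how the discard technique could be applied in JavaScript: the helper name captureExceptBatman is hypothetical (it is not part of the answer), and String.prototype.matchAll is assumed to be available (ES2020 or a recent Node.js).

```javascript
// Hypothetical helper illustrating the discard technique:
// "batman" is matched by the first alternative and thrown away,
// every other word is matched by (\w+) and lands in group 1.
function captureExceptBatman(input) {
  const pattern = /batman|(\w+)/g;
  const kept = [];
  for (const match of input.matchAll(pattern)) {
    // match[1] is undefined when the "batman" alternative matched.
    if (match[1] !== undefined) {
      kept.push(match[1]);
    }
  }
  return kept;
}

const input = "I am ironman and I was batman and I will be superman";
console.log(captureExceptBatman(input));
```

Only group 1 is read; the overall match is ignored, which is what makes the discarded alternative disappear from the result.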
Another example of the discard pattern: if you want to discard both batman and superman, you can use:
batman|superman|(\w+)
Debuggex does a good job of visualizing this pattern.
Upvotes: 3
Reputation: 89547
Several ways are possible:
with a negative lookahead assertion (?!...) (not followed by):
\b(?!batman\b)\w+
with a capture group (you must take into account only capture group 1):
\b(?:batman\b|(\w+))
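Both patterns above can be tried side by side in JavaScript. This is only a sketch: the function names wordsViaLookahead and wordsViaCaptureGroup are made up for illustration, and matchAll is assumed to be available (ES2020).

```javascript
// Hypothetical helper: negative lookahead variant.
// At each word start, refuse to match if the whole word is "batman".
function wordsViaLookahead(input) {
  return input.match(/\b(?!batman\b)\w+/g) || [];
}

// Hypothetical helper: capture group variant.
// "batman" as a whole word is consumed without capturing;
// any other word ends up in group 1.
function wordsViaCaptureGroup(input) {
  const kept = [];
  for (const match of input.matchAll(/\b(?:batman\b|(\w+))/g)) {
    if (match[1] !== undefined) {
      kept.push(match[1]);
    }
  }
  return kept;
}

const input = "I am ironman and I was batman and I will be superman";
console.log(wordsViaLookahead(input));
console.log(wordsViaCaptureGroup(input));
```

Because of the \b after batman, both variants only exclude the exact word: a longer word such as batmans would still be returned.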
Why your pattern doesn't work:
You wrote [^(batman)], but a character class is only an unordered collection of characters; you can't describe substrings inside it. It is the same as [^abmnt()].
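The equivalence can be checked quickly in JavaScript. A small sketch, with the helper name matchRuns chosen only for this illustration:

```javascript
// Hypothetical helper: returns all matches of a global regex, or [] if none.
function matchRuns(input, regex) {
  return input.match(regex) || [];
}

const sample = "I am ironman and I was batman and I will be superman";

// The class as written in the question and its spelled-out equivalent.
const asWritten = matchRuns(sample, /[^(batman)]+/g);
const spelledOut = matchRuns(sample, /[^abmnt()]+/g);

// Both classes exclude the same set of characters, so the matches are identical.
console.log(JSON.stringify(asWritten) === JSON.stringify(spelledOut)); // true
console.log(asWritten);
```

Neither pattern ever matches a whole word such as ironman, because the letters a, m and n are excluded everywhere, not just inside batman.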
Upvotes: 4
Reputation: 59232
Okay, enough of bullshit, here's the code:
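var input = "I am ironman and I was batman and I will be superman";
// Split on spaces and drop every word equal to "batman" (case-insensitive).
var words = input.split(" ").filter(function (str) {
    return str.toLowerCase() !== "batman";
});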
Upvotes: 4