Reputation: 1185
I'm trying to write a regular expression that matches all words inside a specific string, but skips words inside brackets. I currently have one regex that matches all words:
/[a-z0-9]+(-[a-z0-9]+)*/i
I also have a regex that matches all words inside brackets:
/\[(.*)\]/i
I basically want to match everything that the first regex matches, but without everything the second regex matches.
Sample input text: http://gist.github.com/222857 It should match every word separately, without the one in the brackets.
Any help is appreciated. Thanks!
Upvotes: 4
Views: 1774
Reputation: 82
This seems to work:
[^\[][a-z0-9]+(-[a-z0-9]+)*
If the first letter of a word is an opening bracket, it doesn't match it.
By the way, is there a reason why you are capturing the words with dashes in them? If there is no need for that, your regex could be simplified.
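A quick check of the pattern above, as a sketch only: the sample string is made up (the gist from the question is not reproduced here), and the group is written as non-capturing so that scan returns whole matches.

```ruby
# Hypothetical sample text, not the asker's real input.
text = "foo [bar] baz"

# Same pattern as above, with (?:...) instead of (...) so that
# String#scan returns the full match rather than the group.
pattern = /[^\[][a-z0-9]+(?:-[a-z0-9]+)*/

found = text.scan(pattern)
p found
```

On this input, [^\[] consumes one character of its own, so the match can simply start at the "b" of "bar" (or at the space before "baz"). The bracketed word still comes back, which is worth checking against your real data before relying on it.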
Upvotes: -1
Reputation: 75232
Which Ruby version are you using? If it's 1.9 or later, this should do what you want:
/(?<![\[a-z0-9-])[a-z0-9]+(-[a-z0-9]+)*(?![\]a-z0-9-])/i
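To see how the lookbehind and lookahead behave, here is a minimal sketch. The sample string is an assumption (the gist is not reproduced here), and the group is made non-capturing so that scan returns whole matches.

```ruby
# Hypothetical sample text, not the asker's real input.
text = "foo [bar] baz-qux [skip-me] end"

# Same idea as the regex above: a word may not be preceded by "[",
# a word character or "-", and may not be followed by "]",
# a word character or "-".
word = /(?<![\[a-z0-9-])[a-z0-9]+(?:-[a-z0-9]+)*(?![\]a-z0-9-])/i

words = text.scan(word)
p words
```

Note that the lookarounds only inspect the characters directly next to a word. With three or more words inside one pair of brackets, such as "[one two three]", the middle word touches neither bracket and would still match.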
Upvotes: 1
Reputation: 15488
How 'bout this:
your_text.scan(/\[.*\]|([a-z0-9]+(?:-[a-z0-9]+)*)/i) - [[nil]]
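Roughly how that one-liner works, as a sketch on a made-up string: the bracket alternative is tried first and has no capture group, so it yields [nil], which is then subtracted. The final flatten is an addition for readability, not part of the line above.

```ruby
# Hypothetical sample text, not the asker's real input.
text = "foo [bar] baz-qux end"

# First alternative swallows a bracketed chunk (no capture, so nil);
# second alternative captures an ordinary word.
matches = text.scan(/\[.*\]|([a-z0-9]+(?:-[a-z0-9]+)*)/i)
p matches

# Drop the [nil] entries left by bracketed chunks, then flatten
# the one-element arrays into plain strings.
words = (matches - [[nil]]).flatten
p words
```

Because .* is greedy, a line with several bracketed groups is swallowed from the first "[" to the last "]", including any words in between. Using .*? instead would stop at the nearest closing bracket.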
Upvotes: 1
Reputation: 1121
I agree with Shhnap. Without more info, it sounds like the easiest way is to remove what you don't want, but the regex needs to be /\[(.*?)\]/ instead. After that you can split on \s.
If you are trying to iterate through each word, and you want each word to match, maybe you can cheat a little with: string.split(/\W+/). You will lose the quotation marks and the like, but you get each word.
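A small sketch comparing the two splits, assuming a made-up sample string and the non-greedy bracket regex:

```ruby
# Hypothetical sample text, not the asker's real input.
text = "foo [bar baz] qux-quux \"quoted\" end"

# Remove each bracketed chunk (non-greedy, so it stops at the first "]").
without_brackets = text.gsub(/\[.*?\]/, "")

# Splitting on whitespace keeps punctuation attached to the words.
by_space = without_brackets.split(/\s+/)

# Splitting on non-word characters drops the quotes, but also
# breaks hyphenated words apart.
by_nonword = without_brackets.split(/\W+/)

p by_space
p by_nonword
```

So split(/\W+/) is only a good fit if hyphenated words do not need to stay in one piece.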
Upvotes: 0
Reputation: 13477
I don't think I understand the question properly. Why not just make a new string that does not contain the second regex like so:
string1 = string1.gsub(/\[(.*)\]/, '')
Off the top of my head won't that match what you deleted while storing the result in string1? I have not tested this yet though. I might test it later.
Upvotes: 0
Reputation: 993125
Perhaps you could do it in two steps:
Using a single regular expression to try to do both these things will end up being more complicated than it needs to be.
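One possible reading of a two-step approach, as a sketch only: the steps shown here (first remove the bracketed parts, then scan for words) are an assumption, and the sample string and variable names are made up.

```ruby
# Hypothetical sample text, not the asker's real input.
text = "foo [bar baz] qux-quux [x] end"

# Assumed step 1: remove every bracketed chunk (non-greedy).
bracketed = /\[.*?\]/
stripped = text.gsub(bracketed, "")

# Assumed step 2: scan what is left with the word regex from the
# question, using a non-capturing group so scan returns whole words.
word = /[a-z0-9]+(?:-[a-z0-9]+)*/i
words = stripped.scan(word)

p words
```

Each regex stays simple this way, and hyphenated words are kept in one piece.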
Upvotes: 3