Reputation: 83
I should preface this by stating that I'm working with VB6 and the RegExp object.
I'm attempting to find and replace whole words. By "whole word" I mean that a valid match is not a substring of another word, although some special characters next to it are OK. I'm a novice at regular expressions. This is what I was trying:
([^a-z]+)(Foo)([^a-z]+)
It seems close but I'm having some trouble in certain situations.
For example, if I search the string
Foo Foo
or
Foo(Foo)
or anywhere a line ends with Foo and the following line begins with Foo:
This is a line with Foo
Foo starts the next line
In any of these cases only the first Foo is matched.
Well, maybe it isn't a problem with the match but rather my replace method. I don't know exactly how I can verify that. I'm using groups to replace whatever bounding char is matched by the expression, like so:
regEX.Replace(source, "$1" & newstring & "$3")
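One way to check whether the match or the replace is at fault is to run the pattern on a tiny input. The sketch below uses Python's re module as a stand-in for the VBScript RegExp object (an assumption, chosen only so the example is easy to run), with case-insensitive matching assumed and "X" as a hypothetical replacement string. It suggests the match is the culprit: the separator after one Foo is consumed by the third group, so it is no longer available as the leading separator of the next Foo, and a Foo at the very start of the input has no leading separator at all.

```python
import re

# The pattern from the question. IGNORECASE is an assumption (it mirrors
# setting IgnoreCase = True on the VBScript RegExp object).
PATTERN = re.compile(r"([^a-z]+)(Foo)([^a-z]+)", re.IGNORECASE)

def replace_consuming(source, newstring):
    # Put the captured separators back around the new text,
    # like "$1" & newstring & "$3" in the question.
    return PATTERN.sub(lambda m: m.group(1) + newstring + m.group(3), source)

# The space between the two words is used up by the first match, so the
# second Foo has no leading separator left to match against.
print(repr(replace_consuming(" Foo Foo ", "X")))   # ' X Foo '

# Nothing precedes the first Foo, so only the second one can match.
print(repr(replace_consuming("Foo(Foo)", "X")))    # 'Foo(X)'
```

Both results point at the same cause: the bounding characters are part of the match, so two neighboring words cannot share one.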
So in summary I want to avoid matching: FooBar BarFoo
Any of the following would be valid matches:
Foo Foo
Foo Bar
Foo_bar
Foo.bar
Foo, bar
Foo(bar)
Foo(Foo)
If anyone can kindly show me the proper way to do this I would much appreciate it!
Edit:
Looks like I spoke a little too soon regarding the first solution below. After a little testing and some further reading, I see that underscore is a word character, so the \b pattern won't match Foo when an underscore follows it. I came up with this, which does the trick. Is there a better way?
(\b)(Foo)(\b|_)
regEX.Replace(source, "$1" & newstring & "$3")
It works, but seems a little sloppy.
Upvotes: 7
Views: 66306
Reputation: 425073
Use the "word boundary" expression \b.
Perhaps something as simple as this will do:
(.*)\bFoo\b(.*)
FYI, the word boundary expression \b is a zero-width match between a word character \w and a non-word character [^\w] or vice versa, and consumes no input.
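A small sketch of what "zero-width" means in practice, again using Python's re module as a stand-in for the VBScript RegExp object (an assumption; "X" is a hypothetical replacement string). Because \b matches a position rather than a character, no bounding character is consumed, so no capture groups are needed and adjacent occurrences can all be replaced.

```python
import re

# \b matches a position, not a character, so nothing around Foo is consumed.
WHOLE_WORD = re.compile(r"\bFoo\b")

def replace_word(source, newstring):
    # No groups to put back: only the word itself is part of the match.
    return WHOLE_WORD.sub(lambda m: newstring, source)

print(replace_word("Foo Foo", "X"))                          # X X
print(replace_word("Foo(Foo)", "X"))                         # X(X)
print(repr(replace_word("ends with Foo\nFoo starts", "X")))  # 'ends with X\nX starts'
print(replace_word("FooBar BarFoo", "X"))                    # FooBar BarFoo
```

In VB6 the equivalent would presumably be setting Global = True on the RegExp object and calling Replace with just the new string, since there is no "$1" or "$3" to restore.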
Underscore and digit characters are considered "word characters", so Foo_Bar, Bar_Foo, and Foo123 wouldn't match. To rectify that, so that any non-letter is considered "end of word" (including start and end of input), use lookarounds:
(?i)(.*(?<![a-z]))Foo((?![a-z]).*)
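One caveat for the VB6 setup in the question: as far as I know, the VBScript RegExp object supports lookahead but neither lookbehind nor the inline (?i) flag, so the lookaround pattern may need adapting there. The sketch below shows one possible adaptation, not the answer's exact pattern: the leading side captures start-of-input or a single non-letter and puts it back, while the trailing side uses a negative lookahead that consumes nothing. It is written with Python's re module as a stand-in (an assumption), and "X" is a hypothetical replacement string.

```python
import re

# Leading side: capture start-of-input or one non-letter, and put it back.
# Trailing side: a negative lookahead, which consumes no input.
# IGNORECASE stands in for IgnoreCase = True on the VBScript RegExp object.
LETTER_BOUNDED = re.compile(r"(^|[^a-z])Foo(?![a-z])", re.IGNORECASE)

def replace_letter_bounded(source, newstring):
    # Equivalent in spirit to Replace(source, "$1" & newstring) in VB6.
    return LETTER_BOUNDED.sub(lambda m: m.group(1) + newstring, source)

for text in ["Foo Foo", "Foo(Foo)", "Foo_bar", "Foo123", "FooBar BarFoo"]:
    print(text, "->", replace_letter_bounded(text, "X"))
# Foo Foo -> X X
# Foo(Foo) -> X(X)
# Foo_bar -> X_bar
# Foo123 -> X123
# FooBar BarFoo -> FooBar BarFoo
```

Only the leading separator is consumed, and it belongs to the previous gap rather than the next word, so neighboring occurrences no longer compete for the same character.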
Upvotes: 19