Reputation: 5804

How to use regular expressions to only capture a word by itself rather than in another word?

How to use regular expressions to only capture a word by itself rather than the word inside another word?

For example, I'd like to remove only the standalone "Co" in "Company & Co", not the "Co" inside "Company".

import re
re.subn('Co', '', "Company & Co")
>> ('mpany & ', 2)  # which I don't want
>> "Company & "  # desired result

Upvotes: 0

Answers (4)

ysakamoto

Reputation: 2532

"A word by itself" means that the word is surrounded by whitespace or sits at the beginning or end of the string. So:

re.subn(r'(\s|^)Co(\s|$)', r'\g<1>\g<2>', "Company & Co")
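
As a rough check of the whitespace-delimited approach, here is a small sketch using the sample string from the question. The second input, with a trailing period, is my own addition to show a case this pattern does not cover, since "." is neither whitespace nor the end of the string.

```python
import re

# Raw strings keep \s and \g<1> from being treated as Python string escapes.
pattern = r'(\s|^)Co(\s|$)'

# The captured whitespace is put back, so only "Co" itself is removed.
result, count = re.subn(pattern, r'\g<1>\g<2>', "Company & Co")
print(repr(result), count)  # 'Company & ' 1

# Illustrative extra input (not from the question): the trailing "."
# is not whitespace and not end-of-string, so nothing is replaced.
missed, missed_count = re.subn(pattern, r'\g<1>\g<2>', "Company & Co.")
print(repr(missed), missed_count)  # 'Company & Co.' 0
```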

Upvotes: 3

Tomalak

Reputation: 338228

You want word boundaries.

They are expressed with \b in most regex dialects (and with \< and \> in some). Python uses \b.

import re
re.subn(r'\bCo\b', '', "Company & Co")

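Note the r in front of the pattern: it makes the pattern a raw string, so Python passes \b through to the regex engine instead of interpreting it as a backspace character.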
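
To see why the r prefix matters, the following sketch runs the same substitution with and without it on the string from the question. The variable names are only for illustration.

```python
import re

text = "Company & Co"

# Raw string: \b reaches the regex engine as a word-boundary assertion.
result, count = re.subn(r'\bCo\b', '', text)
print(repr(result), count)  # 'Company & ' 1

# No r prefix: '\b' is a backspace character (\x08) in a normal string,
# so the pattern looks for backspace + "Co" + backspace and finds nothing.
unchanged, no_count = re.subn('\bCo\b', '', text)
print(repr(unchanged), no_count)  # 'Company & Co' 0
```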

Upvotes: 3

bgporter

Reputation: 36514

Use the r"\b" expression to match the empty string at the beginning or end of what you're looking for to ensure that it's a whole word and not part of another word:

>>> import re
>>> pat1 = re.compile("Co")
>>> pat2 = re.compile(r"\bCo\b")
>>> pat1.match("Company")
<_sre.SRE_Match object at 0x106b92780>
>>> pat2.search("Company")
# (fails)
>>> pat2.search("Co")
<_sre.SRE_Match object at 0x106b927e8>
>>> pat2.search("Co & Something")
<_sre.SRE_Match object at 0x106b92780> # succeeds

This syntax works whether the boundary next to what you're looking for is:

white space
beginning of string
end of string
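
A quick way to check the cases listed above is to run the pattern over a few sample strings. The inputs below are made up for illustration. The punctuation and underscore cases go beyond the list: \b also matches next to any non-word character, while an underscore counts as a word character.

```python
import re

pat = re.compile(r"\bCo\b")

# Hypothetical sample inputs, chosen to cover each kind of boundary.
samples = [
    "Company",           # "Co" is inside a longer word
    "Co",                # beginning and end of string
    "Co & Something",    # beginning of string, then white space
    "Something & Co",    # white space, then end of string
    "Smith & Co., Ltd",  # followed by punctuation
    "Co_op",             # underscore is a word character, so no boundary
]

found = {s: pat.search(s) is not None for s in samples}
for s, ok in found.items():
    print(repr(s), ok)
```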

Upvotes: 0

abhishekgarg

Reputation: 1473

What about this:

import re
print(re.subn(r'Co$', '', "Company & Co"))

Here $ anchors the match to the end of the string. Characters like this are called metacharacters; they are very useful and worth looking at.
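
For what it's worth, here is a short sketch of how the $ anchor behaves on the question's string and on two other inputs. The extra inputs are my own examples, picked to show where anchoring to the end of the string differs from matching a whole word.

```python
import re

# $ ties the match to the end of the string, so "Company" is left alone.
at_end = re.subn(r'Co$', '', "Company & Co")
print(at_end)  # ('Company & ', 1)

# A standalone "Co" that is not last in the string is not matched.
not_last = re.subn(r'Co$', '', "Co & Company")
print(not_last)  # ('Co & Company', 0)

# Nothing is checked on the left, so a word merely ending in "Co" gets trimmed.
suffix = re.subn(r'Co$', '', "Texaco & PepsiCo")
print(suffix)  # ('Texaco & Pepsi', 1)
```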

Upvotes: 1
