Reputation: 66390
German apple is good. Apple is nice. Where is the apple? English apple have worms. English people are nice.
Given the example above, I would like to replace all occurrences of Apple or apple with German apple. In the example, German apple should be ignored because it is already correct. There is one exception: English apple should still be replaced with German apple.
Using online regex generators, I came up with the following, which seems to select all apples (globally, multiline):
/\b(apple)/igm
But this also selects the apple in German apple, which I need to exclude.
This attempt doesn't really work either; the only thing it selects is apple? (including the question mark):
/\b(apple)[^German apple]/igm
I am stuck here already. Would be grateful for a hint.
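For reference, a minimal Python sketch of why the second attempt selects only apple?. It assumes Python's re module in place of the online tester, with re.IGNORECASE standing in for the i flag. The brackets form a negated character class, so [^German apple] consumes one character that is not any of the letters G, e, r, m, a, n, p, l or a space, rather than excluding the phrase German apple:

```python
import re

text = ("German apple is good. Apple is nice. Where is the apple? "
        "English apple have worms. English people are nice.")

# [^German apple] is a negated character class: it matches exactly ONE
# character that is none of G, e, r, m, a, n, space, p, l (case-insensitively).
pattern = re.compile(r"\b(apple)[^German apple]", re.IGNORECASE)

matches = [m.group(0) for m in pattern.finditer(text)]
print(matches)  # ['apple?']
```

Every other apple in the text is followed by a space, which is inside the class and therefore rejected.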
UPDATE:
I am looking at positive and negative lookaround as explained here.
Suppose another line is added to the example above:
Apple from Dutch is sour.
I want Apple from Dutch to be ignored, just as German apple is.
How can this be achieved?
I tried this without luck:
(?i)(?:English )?(?:(?<!German )\bapple\b(?<! from dutch))
Upvotes: 2
Views: 287
Reputation: 174844
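Use one of the regexes below and replace each match with German apple. The first pattern has no inline modifier, so it needs the case-insensitive flag set separately in order to match Apple as well:
(?<!German )(?:English )?\bapple\b
OR
(?i)(?:English )?(?:(?<!German )\bapple\b)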
(?i) - Case-insensitive modifier.
(?:English )? - Optionally matches the string English followed by a space.
(?:(?<!German )\bapple\b) - Matches apple only if it is not preceded by German and a space.
(?<!German ) - Negative lookbehind. It asserts that the text immediately before the current position is not matched by the pattern inside the lookbehind.
Example:
>>> import re
>>> string = 'German apple is good. Apple is nice. Where is the apple? English apple have worms. English people are nice.'
>>> re.sub(r'(?i)(?:English )?(?:(?<!German )\bapple\b)', r'German apple', string)
'German apple is good. German apple is nice. Where is the German apple? German apple have worms. English people are nice.'
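To see which substrings the pattern actually selects before they are replaced, here is a small sketch using re.finditer. The variable names are illustrative only, and re.IGNORECASE is used in place of the inline (?i) modifier, which should behave the same way here:

```python
import re

text = ("German apple is good. Apple is nice. Where is the apple? "
        "English apple have worms. English people are nice.")

# Same pattern as in the answer, with the flag passed separately
# instead of the inline (?i) modifier.
pattern = re.compile(r"(?:English )?(?:(?<!German )\bapple\b)", re.IGNORECASE)

# Collect every matched substring in order of appearance.
matched = [m.group(0) for m in pattern.finditer(text)]
print(matched)  # ['Apple', 'apple', 'English apple']

replaced = pattern.sub("German apple", text)
print(replaced)
```

The existing German apple is skipped by the lookbehind, and English people is untouched because the optional English group only takes part in a match when apple follows it.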
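Update:
To also skip Apple from Dutch, add a negative lookahead after apple. A lookbehind placed at that position, as in the question's attempt, inspects the text that ends there, which is apple itself, so it never blocks the match.
(?i)(?:English )?(?:(?<!German )\bapple\b)(?!\s+from\s+Dutch\b)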
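A short sketch comparing the attempt from the question with the lookahead version on the extra line. The variable names are made up for illustration, and Python's re module is assumed:

```python
import re

line = "Apple from Dutch is sour."

# Attempt from the question: the lookbehind sits after "apple", so it checks
# the text ending at that position (which ends in "apple") and never blocks.
attempt = r"(?i)(?:English )?(?:(?<!German )\bapple\b(?<! from dutch))"

# Updated pattern: the negative lookahead checks what FOLLOWS "apple".
updated = r"(?i)(?:English )?(?:(?<!German )\bapple\b)(?!\s+from\s+Dutch\b)"

attempt_result = re.sub(attempt, "German apple", line)
updated_result = re.sub(updated, "German apple", line)

print(attempt_result)  # German apple from Dutch is sour.
print(updated_result)  # Apple from Dutch is sour.
```

Because of the (?i) modifier, the lookahead also skips variants such as apple from dutch.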
Upvotes: 2