Reputation: 23
I'm trying to replace the first occurrence of “I” in each word, other than at the start of the word, with “ee”. I'm using Java.
This should change the phrase
INFINITY IS GIANT.
To:
INFeeNITY IS GeeANT.
So far, my code has gone through several revisions. One is:
replaceAll("(?<=[^I*])\\BI", "ee");
That uses a lookbehind, I think. Any help is very much appreciated! Thanks.
Upvotes: 2
Views: 126
Reputation: 167832
As you stated in the OP, \\BI matches an I character that is not at the start of a word. If the regular expression then also matches the rest of the word, using (?:\\B.)* or .*?\\b, the search resumes after that word, so it won't match a second I in the same word.
"INFINITY IS GIANT".replaceAll( "\\BI((?:\\B.)*)", "ee$1");
"INFINITY IS GIANT".replaceAll( "\\BI(.*?\\b)", "ee$1");
Both will result in:
INFeeNITY IS GeeANT
It also works if you have accents in the text, provided \\b is Unicode-aware (the default before Java 19):
"IŇFINIŦŶ IS ĞIANŤ".replaceAll( "\\BI((?:\\B.)*)", "ee$1");
"IŇFINIŦŶ IS ĞIANŤ".replaceAll( "\\BI(.*?\\b)", "ee$1");
Both output:
IŇFeeNIŦŶ IS ĞeeANŤ
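Since the default handling of non-ASCII letters by \\b has differed between Java releases (ASCII-only from Java 19 on, as far as I know), a version-independent variant can request Unicode-aware boundaries explicitly. Here is a minimal sketch, assuming the embedded (?U) flag (Pattern.UNICODE_CHARACTER_CLASS) is acceptable for your use case. The class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative and not part of the answer.
final class UnicodeBoundaryDemo {
    // (?U) enables Pattern.UNICODE_CHARACTER_CLASS, so the \b and \B
    // boundary matchers treat accented letters as word characters.
    private static final Pattern FIRST_INNER_I =
            Pattern.compile("(?U)\\BI((?:\\B.)*)");

    static String replaceFirstInnerI(String text) {
        // Group 1 is the rest of the word after the matched I.
        return FIRST_INNER_I.matcher(text).replaceAll("ee$1");
    }

    public static void main(String[] args) {
        // The accented sample phrase, written with Unicode escapes so the
        // source file stays ASCII-only.
        String input = "I\u0147FINI\u0166\u0176 IS \u011EIAN\u0164";
        System.out.println(replaceFirstInnerI(input));
    }
}
```

Compiling the pattern once and reusing it is just a convenience; the same (?U) prefix can be put in front of the pattern passed to String.replaceAll.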
Alternatively
Using \\b(.(?:\\B.)*?)\\BI, you can match from the start of the word up to the first I that is not at the start:
"INFINITY IS GIANT".replaceAll( "\\b(.(?:\\B.)*?)\\BI", "$1ee");
Outputs:
INFeeNITY IS GeeANT
Upvotes: 2
Reputation: 51330
If you don't care about accented letters, this pattern will do the trick:
\b([a-zA-Z][a-hj-zA-HJ-Z]*)[iI]
Replace it with $1ee.
It matches the first letter of a word (\b[a-zA-Z]), then any number of letters except I ([a-hj-zA-HJ-Z]*), then I.
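For reference, here is how that pattern might be applied from Java, the language used in the question. This is only a sketch: the class and method names are made up, and each backslash is doubled because the pattern sits inside a Java string literal.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper; class and method names are illustrative only.
final class AsciiFirstInnerI {
    // Group 1: the word's first letter plus any following letters except i/I.
    // The trailing [iI] is then the first non-initial i of that word.
    private static final Pattern PATTERN =
            Pattern.compile("\\b([a-zA-Z][a-hj-zA-HJ-Z]*)[iI]");

    static String replace(String text) {
        // Keep everything before the matched i and append "ee" in its place.
        return PATTERN.matcher(text).replaceAll("$1ee");
    }

    public static void main(String[] args) {
        System.out.println(replace("INFINITY IS GIANT."));
    }
}
```

Because the pattern ends in [iI], a lowercase i is replaced as well, so "infinity" becomes "infeenity".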
If you have to deal with accented letters, the pattern has to change somewhat:
(?<!\p{L})(\p{L}(?:(?![iI])\p{L})*)[iI]
Here, I used \p{L}, which means any Unicode letter, but had to write (?![iI])\p{L} to mean any Unicode letter except I. I also replaced \b with (?<!\p{L}) to make sure I get Unicode support.
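A minimal Java sketch of the Unicode variant, again with invented class and method names and with the backslashes doubled for the string literal. It is meant as an illustration under those assumptions rather than a drop-in solution.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper; class and method names are illustrative only.
final class UnicodeFirstInnerI {
    // (?<!\p{L})          : not preceded by a letter, i.e. start of a word
    // \p{L}               : the word's first letter (may itself be I)
    // (?:(?![iI])\p{L})*  : further letters, as long as they are not i/I
    // [iI]                : the first non-initial i of the word
    private static final Pattern PATTERN =
            Pattern.compile("(?<!\\p{L})(\\p{L}(?:(?![iI])\\p{L})*)[iI]");

    static String replace(String text) {
        return PATTERN.matcher(text).replaceAll("$1ee");
    }

    public static void main(String[] args) {
        // An accented sample phrase, written with Unicode escapes so the
        // source file stays ASCII-only.
        String input = "I\u0147FINI\u0166\u0176 IS \u011EIAN\u0164";
        System.out.println(replace(input));
    }
}
```

Since this version does not use \b at all, it does not depend on how a given Java release defines word boundaries.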
Upvotes: 0