Reputation: 5201
I am trying to remove all the vowels from a string except for the first and last characters. I have tried two expressions, each in two ways, but in vain; the attempts are described below. Does anybody have a regular expression for this?
e.g.
original string -- source = apeaple
after regex -- source_modified = apple (this is what is expected)
I tried the expression ([a-zA-Z])[aeiouAEIOU]([a-zA-Z])
but this expression removes the repeated character as well. This is what happens when I apply it:
code used --
Regex reg = new Regex("([a-zA-Z])[aeiouAEIOU]([a-zA-Z])"); string source_modified = reg.Replace(source, "");
original string -- source = apeaple
after code execution -- source_modified = aple (repeating character removed)
code used -- string source_modified = Regex.Replace(source, "([a-zA-Z])[aeiouAEIOU]([a-zA-Z])", "$1" + "$2");
original string -- source = apeaple
after code execution -- source_modified = apaple (just 1 vowel gets removed)
I also tried ([a-zA-Z])[aeiouAEIOU]*([a-zA-Z])
but this removes just one vowel, not all of them. This is what happens when I apply it:
code used --
Regex reg = new Regex("([a-zA-Z])[aeiouAEIOU]*([a-zA-Z])"); string source_modified = reg.Replace(source, "");
original string -- source = apeaple
after code execution -- source_modified = "" (all characters are removed)
code used -- string source_modified = Regex.Replace(source, "([a-zA-Z])[aeiouAEIOU]*([a-zA-Z])", "$1" + "$2");
original string -- source = apeaple
after code execution -- source_modified = apeple
Upvotes: 3
Views: 7139
Reputation: 14099
You need lookaround assertions, like so:
(?<!^)[aouieyAOUIEY](?!$)
C# supports lookaround, and it is very powerful:
string resultString = null;
try {
resultString = Regex.Replace(subjectString, "(?<!^)[aouieyAOUIEY](?!$)", "");
} catch (ArgumentException ex) {
// Syntax error in the regular expression
}
Update 1
T.W.R.Cole points out a special case in the English language ("this doesn't work for words like "Anyanka" where an inner 'y' is used as a consonant").
The following change should handle this, using a negative lookahead:
(?<!^)([aouie]|y(?![aouie]))(?!$)
This time, enable the case-insensitive regex option (RegexOptions.IgnoreCase in C#); it makes the regex simpler than the original.
If a y followed by another y still means that the first y is a consonant (er... is there such a word?) and thus should not be removed, then y must be listed in the lookahead's character class as well:
(?<!^)([aouie]|y(?![aouiey]))(?!$)
I repeat that I used C# as my regex dialect, which has good support for lookaround techniques.
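For reference, here is a minimal sketch of how the first "y as consonant" pattern from this update might be wired up, assuming RegexOptions.IgnoreCase is the modifier meant above. The class and method names (InnerVowelDemo, StripInnerVowels) are made up for illustration and are not part of any library.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InnerVowelDemo
{
    // A vowel, or a 'y' that is not followed by a vowel, which is neither
    // the first nor the last character of the input.
    // IgnoreCase lets the character classes stay lowercase only.
    private static readonly Regex InnerVowel = new Regex(
        "(?<!^)([aouie]|y(?![aouie]))(?!$)",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: removes every match of the pattern above.
    public static string StripInnerVowels(string input)
    {
        return InnerVowel.Replace(input, "");
    }

    public static void Main()
    {
        Console.WriteLine(StripInnerVowels("apeaple")); // prints "apple"
        Console.WriteLine(StripInnerVowels("Anyanka")); // prints "Anynka"
    }
}
```

In "Anyanka" the inner y survives because it is followed by a vowel, while the inner a is removed and the first and last A/a are protected by the two anchor lookarounds.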
Upvotes: 7
Reputation: 4992
In case you ever want to apply this to individual words in a string that consists of more than one word, \B[AEIOUaeiou]\B might be worth a try.
\B is a non-word boundary, i.e. any position where the two adjacent characters are either both word characters or both non-word characters. The latter case cannot occur here, because the vowel next to each \B is itself a word character.
Needless to say it also works for strings consisting only of a single word.
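A small sketch of the per-word behaviour, assuming a space-separated input; the class name PerWordVowelDemo, the method name and the sample sentence are only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PerWordVowelDemo
{
    // Removes vowels that have a word character on both sides,
    // so the first and last letter of every word are left alone.
    public static string StripInnerVowelsPerWord(string text)
    {
        return Regex.Replace(text, @"\B[AEIOUaeiou]\B", "");
    }

    public static void Main()
    {
        // prints "apple ornge ida"
        Console.WriteLine(StripInnerVowelsPerWord("apeaple orange idea"));
    }
}
```

Note that digits and underscores also count as word characters, so a vowel next to one of those is treated as being inside a word.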
Upvotes: 0
Reputation: 728
You need to start the string with at least one character, find a vowel and then end the string with at least one character. Try:
(.+)[aeiouAEIOU](.+)
Upvotes: 0
Reputation: 25619
If so, why not remove the first and last characters, strip the vowels from what is left, and then stitch the pieces back together?
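string sWord = "apeaple";
// Remember the first and last characters (assumes at least two characters)
char cFirst = sWord[0], cLast = sWord[sWord.Length - 1];
// Substring(startIndex, length): keep only the part between them
sWord = sWord.Substring(1, sWord.Length - 2);
sWord = cFirst.ToString() +
Regex.Replace(sWord, "[aouiyeAOUIYE]", String.Empty) +
cLast.ToString();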
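The same idea could be wrapped in a helper that also copes with very short strings, which would otherwise make the indexer or Substring throw. This is only a sketch; the names StitchDemo and StripInnerVowels are hypothetical, and like the snippet above it treats y as a vowel.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StitchDemo
{
    public static string StripInnerVowels(string word)
    {
        // Nothing lies between the first and last character for
        // strings shorter than three characters, so return them as-is.
        if (string.IsNullOrEmpty(word) || word.Length < 3)
        {
            return word;
        }

        // Substring(startIndex, length): the part between first and last character
        string middle = word.Substring(1, word.Length - 2);

        return word[0].ToString()
            + Regex.Replace(middle, "[aouiyeAOUIYE]", string.Empty)
            + word[word.Length - 1].ToString();
    }

    public static void Main()
    {
        Console.WriteLine(StripInnerVowels("apeaple")); // prints "apple"
        Console.WriteLine(StripInnerVowels("ab"));      // prints "ab"
    }
}
```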
Upvotes: 7