user2268507

Regex to match vowels in order a-e-i-o-u

I am trying to match strings in which the vowels appear in order, do not repeat, and can be separated by non-vowels.

Currently I am just testing it with a and e. However, I am not getting the output I expect.

grep 'a[^aeiou]*e[^aeiou]*'

So for example ae, abeb, abbbbbbeb should all match.

However, aeeb shouldn't match.

With my regex, why is this matching?

I would have thought the first e in aeeb would match the e in a[^aeiou]*e[^aeiou]*, and that the rest of the regex would then fail?

Thanks for your help.

Upvotes: 2

Views: 4525

Answers (4)

user2968675

Reputation: 834

You should also have a non-vowel bracket expression in front to account for words starting with a character other than aeiou since you're allowing it at the end:

egrep '^[^aeiou]*a[^aeiou]*e[^aeiou]*$'
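A quick way to try a pattern of this shape is sketched below. It assumes the last bracket expression is meant to repeat (`[^aeiou]*`), so that trailing non-vowels are optional, and it uses `grep -E`, which accepts the same syntax as `egrep`. The sample words are made up for illustration.

```shell
# Sample lines, one word per line (hypothetical test data).
words='ae
bae
xabebz
aeeb
baeeb'

# Optional non-vowels before 'a', between 'a' and 'e', and after 'e'.
pattern='^[^aeiou]*a[^aeiou]*e[^aeiou]*$'

# Keep only the lines that fit the pattern from start to end.
matches=$(printf '%s\n' "$words" | grep -E "$pattern")
printf '%s\n' "$matches"
```

The lines with a repeated e (aeeb and baeeb) are filtered out, while words that start with a non-vowel still pass.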

Upvotes: 1

Sam

Reputation: 2200

I'm way too late to this, but for anyone stumbling upon this later, you can simply use:

grep -P "a.*e.*i.*o.*u" myfile.txt

In plain English, this is an 'a' followed by zero or more of any character, followed by 'e', followed by zero or more of any character, and so on.

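If you are searching through a dictionary like the ones in /usr/share/dict/, this finds words that contain a, e, i, o and u in that order. Note that, unlike the bracket expressions in the other answers, .* also matches vowels, so repeated or out-of-order vowels in between are not ruled out. It is also worth mentioning that the pattern will not match the entire word. Because .* is greedy, it matches from the first 'a' that starts such a sequence to the last 'u' that completes it, leaving out the leading and trailing characters. For example, in 'facetious' only 'acetiou' will be matched. grep still prints the whole line either way; the difference only shows with options such as -o. If you want the pattern itself to cover the whole word, you can modify the regex to this: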

grep -P ".*a.*e.*i.*o.*u.*" myfile.txt

Note the '.*' at the beginning and end.
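As a rough check of how the `.*` pattern behaves, the sketch below runs it next to a stricter pattern on a small word list. Both the word list and the strict pattern are assumptions for illustration: the strict pattern simply extends the a/e bracket-expression pattern from the question to all five vowels. Plain `grep` and `grep -E` are used here because `-P` is not needed for these patterns.

```shell
# Hypothetical sample words, one per line.
words='facetious
abstemious
adventitious
education'

# Loose: any characters, vowels included, may sit between the five vowels.
loose=$(printf '%s\n' "$words" | grep 'a.*e.*i.*o.*u')

# Strict: only non-vowels between the vowels, anchored to the whole line.
strict=$(printf '%s\n' "$words" |
  grep -E '^[^aeiou]*a[^aeiou]*e[^aeiou]*i[^aeiou]*o[^aeiou]*u[^aeiou]*$')

# -o prints only the part of the line that the pattern matched.
matched_part=$(printf 'facetious\n' | grep -o 'a.*e.*i.*o.*u')

printf 'loose:\n%s\n\n' "$loose"
printf 'strict:\n%s\n\n' "$strict"
printf 'matched part of facetious: %s\n' "$matched_part"
```

Here adventitious passes the loose pattern but not the strict one, because its repeated i is swallowed by `.*`.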

Hope this helps.

Upvotes: 1

hwnd

Reputation: 70732

Use the beginning-of-line ^ and end-of-line $ anchors, so that the pattern has to match the whole line.

#!/bin/sh
STRING=$( cat <<EOF
ae
abeb
abbbbbbeb
aeeb
EOF
)
echo "$STRING" | grep '^a[^aeiou]*e[^aeiou]*$'
## ae
## abeb
## abbbbbbeb

Upvotes: 2

univerio

Reputation: 20538

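a[^aeiou]*e[^aeiou]* matches the ae in aeeb. grep prints every line that contains a match anywhere, so the leftover eb does not have to match anything.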

If you want to make sure the line aeeb doesn't match, you have to anchor the regex:

^a[^aeiou]*e[^aeiou]*$
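To see this for yourself, `grep -o` prints only the part of the line that matched. The following is a minimal sketch using the line aeeb from the question; the variable names are made up for illustration.

```shell
line='aeeb'
pattern='a[^aeiou]*e[^aeiou]*'

# Unanchored: the pattern only has to match somewhere inside the line.
unanchored=$(printf '%s\n' "$line" | grep -o "$pattern")

# Anchored: the whole line must fit, so grep prints nothing here.
# '|| true' hides grep's exit status 1 for "no match".
anchored=$(printf '%s\n' "$line" | grep "^${pattern}\$" || true)

printf 'unanchored match: %s\n' "$unanchored"
printf 'anchored match: %s\n' "${anchored:-<none>}"
```

An alternative to writing the anchors by hand is `grep -x`, which only selects lines that the pattern matches in full.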

Upvotes: 2
