Reputation: 27875
What counts as a vowel or a word can be subjective, so here is the rule I am using:
\n, , (comma), . (period) and the space character are not part of a word.
I have the following string:
text = """line with every word a vowel
sntshk xx yy.
Okay zz fine."""
My try:
s = re.findall(r'[^aeiouAEIOU].*', text)
print(s)
Expectation:
['sntshk', 'xx', 'yy', 'zz']
Reality:
['line with every word a vowel', '\nsntshk xx yy.', '\nOkay zz fine.']
Related: Search all words with no vowels
Upvotes: 1
Views: 3433
Reputation: 363
This works:
text = """line with every word a vowel
sntshk xx yy.
Okay zz fine."""
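q = ''
s = text.split()
for i in range(len(s)):
    # drop trailing punctuation so 'yy.' is treated as 'yy'
    s[i] = s[i].strip('.,')
    for c in range(len(s[i])):
        if s[i][c].lower() in 'aeiou':
            break
    else:
        # the else branch runs only when no vowel triggered a break
        q += s[i] + ' '
print(q)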
Upvotes: 0
Reputation: 26039
There is a pure Python way you can do this without any imports:
[x.strip('.') for x in text.split() if all(y.lower() not in 'aeiou' for y in x)]
Example:
text = """line with every word a vowel
sntshk xx yy.
Okay zz fine."""
print([x.strip('.') for x in text.split() if all(y.lower() not in 'aeiou' for y in x)])
# ['sntshk', 'xx', 'yy', 'zz']
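The same idea can be wrapped in a small function. This is only a sketch: the name words_without_vowels and the strip_chars parameter are made up here, and stripping commas as well as periods is an assumption based on the rules in the question.

```python
def words_without_vowels(text, strip_chars=".,"):
    """Return whitespace-separated tokens of text that contain no vowels."""
    result = []
    for token in text.split():
        # Remove leading/trailing punctuation before testing the word.
        word = token.strip(strip_chars)
        # Skip tokens that were punctuation only, then reject any vowel.
        if word and not any(ch.lower() in "aeiou" for ch in word):
            result.append(word)
    return result


text = """line with every word a vowel
sntshk xx yy.
Okay zz fine."""

found = words_without_vowels(text)
print(found)
```

Note that a token made only of digits, such as 42, also contains no vowels and would be returned by this approach.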
Upvotes: 1
Reputation: 521794
I would just use the pattern \b[^AEIOU_0-9\W]+\b in case-insensitive mode:
import re

text = """line with every word a vowel
sntshk xx yy.
Okay zz fine."""
s = re.findall(r'\b[^AEIOU_0-9\W]+\b', text, flags=re.I)
print(s)
['sntshk', 'xx', 'yy', 'zz']
The class [^\W] is in fact a double negative and means any word character. Adding vowels, digits, and the underscore to this negated class excludes them as well, leaving only consonants.
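To see what the extra exclusions do, here is a small sketch that runs the same pattern over a made-up sample (the string is not from the question) containing digits and an underscore:

```python
import re

# Hypothetical sample: consonant-only words mixed with digits and underscores.
sample = "xx x1 y_y Zz fine 42 tsk"

pattern = re.compile(r'\b[^AEIOU_0-9\W]+\b', flags=re.I)
matches = pattern.findall(sample)
print(matches)
# ['xx', 'Zz', 'tsk']
```

Tokens such as x1 and y_y are rejected because the digit and the underscore are still word characters, so no word boundary falls next to the consonants inside them.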
Upvotes: 2
Reputation: 370929
Use an ordinary character set composed of alphabetical characters, excluding the vowels, with word boundaries at each end:
(?i)\b[b-df-hj-np-tv-z]+\b
https://regex101.com/r/DqGuY1/1
(?i) - Case-insensitive match
\b - Word boundary
[b-df-hj-np-tv-z]+ - Repeat one or more of: b-d, or f-h, or j-n, or p-t, or v-z
\b - Word boundary
More readably, but less elegantly, you could also use
(?i)\b(?:(?![eiou])[b-z])+\b
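As a quick check, both patterns can be run against the text from the question with re.findall. The variable names below are only illustrative:

```python
import re

text = """line with every word a vowel
sntshk xx yy.
Okay zz fine."""

# Explicit consonant ranges.
explicit = r'(?i)\b[b-df-hj-np-tv-z]+\b'
# Any letter b-z that is not e, i, o or u ('a' is already outside b-z).
lookahead = r'(?i)\b(?:(?![eiou])[b-z])+\b'

result_explicit = re.findall(explicit, text)
result_lookahead = re.findall(lookahead, text)
print(result_explicit)
print(result_lookahead)
```

Both calls should give the same list, since the two patterns describe the same set of letters.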
Upvotes: 2
Reputation: 37745
[^aeiouAEIOU]
This means: match any character except aeiouAEIOU. It therefore also matches non-alphabetic characters such as spaces and punctuation, which you don't want, since you are after words only.
Instead, match only the letters that are not vowels:
\b[bcdfghjklmnpqrstvwxyz]+\b
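One thing to watch: the class above lists lowercase letters only. A small sketch with a made-up sample string shows the difference the re.I flag makes:

```python
import re

# Hypothetical sample with capitalised consonant-only words.
sample = "Hmm tsk Shh okay"
pattern = r'\b[bcdfghjklmnpqrstvwxyz]+\b'

lower_only = re.findall(pattern, sample)
any_case = re.findall(pattern, sample, flags=re.I)
print(lower_only)
print(any_case)
```

Without the flag, a word like Hmm is skipped because H is not in the class; with re.I it is matched.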
Upvotes: 1