Reputation: 49
I am writing a program that extracts only the consonants from the part of a web address between www. and .com. For example, if I input www.google.com it should return 'ggl', but that doesn't happen.
import re
x=int(raw_input())
for i in range(x):
    inp1=raw_input()
    y=re.findall('^www\.[^(aeiou)]+\.com',inp1)
    print y
    inp2=y[0]
    print inp2
So what's the mistake in the line y=re.findall('^www\.[^aeiou]+\.com',inp1)?
Upvotes: 1
Views: 95
Reputation: 78740
This can be done with a regex, and you don't need a variable-width lookbehind to achieve it. You can use a negative lookahead to skip the www. prefix and a positive lookahead to require a later .com:
>>> import re
>>> s = 'www.google.com'
>>> re.findall(r'(?!w{1,3}\.)([^aeiou\W])(?=.*\.com)', s)
['g', 'g', 'l']
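Step by step: (?!w{1,3}\.) rejects any position where one to three w characters followed by a dot begin, which skips the www. prefix; [^aeiou\W] matches a single word character that is not a lowercase vowel (so digits and underscores also match); (?=.*\.com) requires that .com still appears later in the string.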
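To get the single string 'ggl' the question asks for, the per-character matches can be joined. The following is only a minimal sketch built on the pattern above; the helper name consonants is hypothetical, and the pattern is written as a raw string so the backslashes reach the regex engine unchanged.

```python
import re

# Same pattern as in the answer, compiled once as a raw string.
CONSONANT_RE = re.compile(r'(?!w{1,3}\.)([^aeiou\W])(?=.*\.com)')

def consonants(address):
    """Hypothetical helper: join the single-character matches into one string."""
    return ''.join(CONSONANT_RE.findall(address))

print(consonants('www.google.com'))  # ggl
```

One caveat of this pattern: because of the (?!w{1,3}\.) check, a w that sits directly before .com (as in www.stackoverflow.com) is skipped as well.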
Upvotes: 1
Reputation: 30273
This is not possible with a regex. To find all matches while always checking for the preceding www., you'd need variable-width lookbehinds, which Python's re module does not support.
If they were supported, which, again, they are not, the following regex would have been what you were looking for:
y=re.findall('(?<=^www\..*)[^aeiou]+(?=.*?\.com)',inp1)
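The limitation can be checked directly. This small sketch, assuming the standard library re module, tries to compile the pattern above and reports the error instead of crashing; the variable names are illustrative only.

```python
import re

# The pattern from the answer, with its variable-width lookbehind (?<=^www\..*)
pattern = r'(?<=^www\..*)[^aeiou]+(?=.*?\.com)'

try:
    re.compile(pattern)
    compiled = True
except re.error as exc:
    # The standard re module rejects lookbehinds whose width is not fixed.
    compiled = False
    print('re.error: %s' % exc)
```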
The answer, however, is simply that you cannot do what you're looking to do with a regex.
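A possible workaround, which is not part of this answer's regex, is to split the job in two: capture everything between www. and .com with one group, then drop the vowels in plain Python. This is a sketch under assumptions: the helper name consonants_between is made up, non-matching input is treated as an empty result, and only alphabetic characters are kept.

```python
import re

def consonants_between(address):
    """Hypothetical helper: consonants found between 'www.' and '.com'."""
    match = re.match(r'^www\.(.+)\.com$', address)
    if match is None:
        return ''  # assumption: no match means nothing to return
    middle = match.group(1)
    # Keep letters only, and drop vowels regardless of case.
    return ''.join(ch for ch in middle
                   if ch.isalpha() and ch.lower() not in 'aeiou')

print(consonants_between('www.google.com'))  # ggl
```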
Upvotes: 1