Match a pattern between the www. and .com parts of a web address

Question

I am writing a program that returns only the consonants in a web address between www. and .com. For example, if I input www.google.com it should return 'ggl', but that doesn't happen.

import re

x=int(raw_input())

for i in range(x):
    inp1=raw_input()
    y=re.findall('^www\.[^(aeiou)]+\.com',inp1)
    print y
    inp2=y[0]
    print inp2

So what's the mistake in the line y=re.findall('^www\.[^(aeiou)]+\.com',inp1)?

timgeb · Accepted Answer

This can be done with a single regex, and you don't need a variable-width lookbehind to achieve it. You can combine a negative lookahead with a positive lookahead:

>>> import re
>>> s = 'www.google.com'
>>> re.findall('(?!w{1,3}\.)([^aeiou\W])(?=.*\.com)', s)
['g', 'g', 'l']

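Step by step: (?!w{1,3}\.) rejects any position where one to three w characters are immediately followed by a dot, which skips the letters of the www. prefix. ([^aeiou\W]) then captures a single word character that is not a lowercase vowel. (?=.*\.com) requires that .com still appears later in the string, so the letters of com itself are never returned.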
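As for why the pattern in the question fails: [^(aeiou)]+ demands that every character between www. and .com be a non-vowel, so any domain containing a vowel does not match at all, findall returns an empty list, and y[0] raises an IndexError. Even when it does match, findall returns the whole address rather than just the consonants. If the lookaheads feel too dense, a two-step version may be easier to read. The sketch below is only an illustration, not part of the original answer: the helper name consonants_between is made up, and it assumes the address starts with www. and ends with .com.

```python
import re

# Pattern from the question: every character between "www." and ".com"
# must be a non-vowel, so a domain containing a vowel cannot match.
ORIGINAL = r'^www\.[^(aeiou)]+\.com'


def consonants_between(address):
    """Hypothetical helper: non-vowel word characters between 'www.' and '.com'."""
    # Step 1: capture everything between the prefix and the suffix.
    match = re.search(r'^www\.(.+)\.com$', address)
    if match is None:
        return ''
    # Step 2: drop lowercase vowels and non-word characters,
    # mirroring the [^aeiou\W] class used in the answer above.
    return re.sub(r'[aeiou\W]', '', match.group(1))


print(re.findall(ORIGINAL, 'www.google.com'))  # empty list, so y[0] would fail
print(re.findall(ORIGINAL, 'www.sky.com'))     # whole address, not just consonants
print(consonants_between('www.google.com'))    # ggl
```

The helper returns a string; wrap the result in list() if you want the same ['g', 'g', 'l'] shape that findall gives.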
