Reputation: 145
import re

embed_url = 'http://www.vimeo.com/52422837'
response = re.search(r'^(http://)?(www\.)?(vimeo\.com/)?([\/\d+])', embed_url)
return response.group(4)
The response is:
5
I was hoping for
52422837
Does anybody have an idea? I'm really bad with regexes :S
Upvotes: 7
Views: 4168
Reputation: 39424
To get everything after the last slash (assuming there is one) the following simple regex should do it:
[^/]*$
(Greedily grabs everything up to the end that isn't a slash.)
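For what it's worth, here is a minimal sketch of how that pattern could be applied in Python, assuming the ID is always the final path segment. The helper name `last_segment` is made up for illustration:

```python
import re

def last_segment(url):
    # [^/]*$ can always match (possibly the empty string),
    # so re.search() does not return None here.
    return re.search(r'[^/]*$', url).group(0)

print(last_segment('http://www.vimeo.com/52422837'))
```

Note that a URL with a trailing slash would give an empty string with this approach, so it may be worth stripping trailing slashes first.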
Upvotes: 1
Reputation: 137584
Don't reinvent the wheel!
>>> import urlparse
>>> urlparse.urlparse('http://www.vimeo.com/52422837')
ParseResult(scheme='http', netloc='www.vimeo.com', path='/52422837', params='',
query='', fragment='')
>>> urlparse.urlparse('http://www.vimeo.com/52422837').path.lstrip("/")
'52422837'
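The session above is Python 2. On Python 3 the same function lives in `urllib.parse`, so a rough equivalent might look like the sketch below. The helper name `video_id_from_path` is hypothetical, and it assumes the ID is the whole path:

```python
from urllib.parse import urlparse

def video_id_from_path(url):
    # For the URL in the question, .path is '/52422837';
    # strip('/') removes leading and trailing slashes.
    return urlparse(url).path.strip('/')

print(video_id_from_path('http://www.vimeo.com/52422837'))
```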
Upvotes: 10
Reputation: 1122092
Use \d+ (no brackets) to match one or more digits; the literal slash is already matched by the preceding vimeo\.com/ group:
response = re.search(r'^(http://)?(www\.)?(vimeo\.com/)?(\d+)', embed_url)
Result:
>>> re.search(r'^(http://)?(www\.)?(vimeo\.com/)?(\d+)', embed_url).group(4)
'52422837'
You were using a character class ([...]) where none was needed. The pattern [\/\d+] matches exactly one character: a /, a +, or a digit. That is why group 4 captured only the first digit, 5.
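To see the difference side by side, here is a small sketch that runs both patterns against the URL from the question. The variable names `prefix`, `broken` and `fixed` are just for illustration:

```python
import re

embed_url = 'http://www.vimeo.com/52422837'
prefix = r'^(http://)?(www\.)?(vimeo\.com/)?'

# Character class: matches a single character that is '/', '+', or a digit.
broken = re.search(prefix + r'([\/\d+])', embed_url).group(4)

# Plain \d+: matches a run of one or more digits.
fixed = re.search(prefix + r'(\d+)', embed_url).group(4)

print(broken)  # 5
print(fixed)   # 52422837
```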
Upvotes: 5