Greg Bright

Reputation: 9

Truncating the end of variables based on pattern

I have a list of URLs in formats such as "www.blah.com/en-us" and I need to cut off everything after the "www.blah.com". I've tried using the following:

import re
website = "www.blah.com/en-us"
cleanURL = re.sub('(.|\n)*?com', "", website)

Output: '/en-us'

So I'm getting the opposite of what I want. Sorry if this post isn't correctly formatted, first time asking a question.

Upvotes: 0

Views: 57

Answers (2)

Fulgen

Reputation: 432

How about just using

website = "www.blah.com/en-us"
cleanURL = website.split("/",1)[0]

?
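As for why the regex attempt returned the opposite: re.sub removes whatever the pattern matches, and the pattern in the question matches everything up to and including "com". If a regex is still preferred, a minimal sketch is to match the part to discard instead. The names clean_url, urls and clean_urls, and the extra sample URLs, are made up for illustration.

```python
import re

website = "www.blah.com/en-us"

# re.sub deletes the text the pattern matches, so match the part to throw away:
# the first "/" and everything after it.
clean_url = re.sub(r"/.*", "", website)
print(clean_url)

# The split approach applied to a whole list (sample URLs are hypothetical).
urls = ["www.blah.com/en-us", "www.example.org/a/b", "www.nopath.net"]
clean_urls = [u.split("/", 1)[0] for u in urls]
print(clean_urls)
```

Both approaches leave a URL without any "/" unchanged.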

Upvotes: 4

Andrew Zick

Reputation: 603

Is using regex a must? If there's no protocol (e.g. http://) in the URLs that you're trying to process, you could just use your_url_string.split('/', 1)[0], which splits on the first instance of '/' and gives you the part before it.
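If some of the URLs do carry a protocol, the plain split would return just "http:", since the first '/' is part of "://". One possible sketch for that case uses urlsplit from the standard library. The helper name host_only is hypothetical, and this assumes you want only the host part (netloc), which would also include a port or credentials if the URL has them.

```python
from urllib.parse import urlsplit


def host_only(url):
    """Return the host part of a URL, with or without a protocol."""
    # urlsplit only recognises a host after a "//" marker,
    # so prepend one when the string has no protocol.
    if "://" not in url:
        url = "//" + url
    return urlsplit(url).netloc


print(host_only("www.blah.com/en-us"))
print(host_only("http://www.blah.com/en-us"))
```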

Upvotes: 2
