dmx

Reputation: 1990

How to find the correct website link from a composed string with Python

I have a list of first and last names that is used to compose website links. But some users don't follow the naming rule, so their actual website name doesn't match the expected one.

Here is an example: let's say the first name is John and the last name is Paul. In this case, the website URL should be johnpaul.com. But sometimes users pick a variant instead, such as pauljohn.com or john-paul.com.

I would like to automate some processes on these websites. The vast majority of the URLs are correct, but some are not. When one is not correct, I just google the expected URL, and the real site is generally the first or second result I get.

I was wondering whether it is possible to make a Google request from Python and check the first 2 or 3 links to get the actual URL. Any idea how to do something like this?

My code currently looks like this:

import requests

# assumes arr holds the composed URLs
for url in arr:
    try:
        print(requests.get(url, timeout=10).status_code, url)
    except requests.RequestException:
        print(url, "is not available")

Upvotes: 1

Views: 50

Answers (1)

Roy Holzem

Reputation: 870

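I'd go with endswith():

string = "bla.com"
strfilter = ('.com', '.de')  # endswith() accepts a tuple of suffixes
if string.endswith(strfilter):
    # a bare string cannot be raised; raise an exception instance instead
    raise ValueError("400 Bad Request")

This way you filter out the .com, .net, etc. errors.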
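Another option, before resorting to a search engine, is to generate the name variants mentioned in the question (johnpaul.com, pauljohn.com, john-paul.com) and probe each one. The following is only a minimal sketch under stated assumptions: the helper names `candidate_urls`, `is_reachable` and `first_working` are hypothetical, the `http://` scheme and the `.com` suffix are assumed, and the standard library's `urllib` is used here in place of `requests` so the block has no third-party dependency.

```python
import urllib.request
from itertools import permutations


def candidate_urls(first, last, separators=("", "-"), tlds=(".com",)):
    """Build plausible URL variants for a first/last name pair (hypothetical helper)."""
    parts = (first.lower(), last.lower())
    urls = []
    for a, b in permutations(parts):          # both name orders
        for sep in separators:                # joined directly or with a hyphen
            for tld in tlds:
                url = f"http://{a}{sep}{b}{tld}"  # scheme is an assumption
                if url not in urls:           # skip duplicates, keep order
                    urls.append(url)
    return urls


def is_reachable(url, timeout=5):
    """Return True if the URL answers with a non-error HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 400
    except OSError:                           # URLError, HTTPError, timeouts
        return False


def first_working(first, last, check=is_reachable):
    """Return the first candidate accepted by `check`, or None if none is."""
    return next((u for u in candidate_urls(first, last) if check(u)), None)
```

Calling `first_working("John", "Paul")` tries the expected URL first and then the variants. A reachable variant is not guaranteed to belong to the right person, so a match found this way may still need a manual check.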

Upvotes: 1
