Link with status code 200 redirects

Question

I have a link that returns status code 200, but when I open it in a browser it redirects to a different page.

When I fetch the same link with Python Requests, I only get the data from the original link, not from the page the browser ends up on. I tried both Requests and urllib, with the same result.

How can I capture the final URL and its data?
How can a link with status 200 redirect at all?

>>> import requests
>>> url = 'http://www.afaqs.com/news/story/52344_The-target-is-to-get-advertisers-to-switch-from-print-to-TV-Ravish-Kumar-Viacom18'
>>> r = requests.get(url)
>>> r.url
'http://www.afaqs.com/news/story/52344_The-target-is-to-get-advertisers-to-switch-from-print-to-TV-Ravish-Kumar-Viacom18'
>>> r.history
[]
>>> r.status_code
200


Keyur Potdar · Accepted Answer

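This kind of redirect is done on the client side, by JavaScript (or similar markup) in the page itself, not by an HTTP 3xx response. That is why the server answers with status 200 and r.history is empty: requests.get(...) only follows HTTP-level redirects, so you won't get the redirected link from it directly.

The redirected URL does appear in the page source returned for the original URL, right after the text URL=. Your job is to scrape it from there. You can do that with a regular expression, or simply with a couple of string split operations.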

For example:

import requests

r = requests.get('http://www.afaqs.com/news/story/52344_The-target-is-to-get-advertisers-to-switch-from-print-to-TV-Ravish-Kumar-Viacom18')
# The redirect target sits between 'URL=' and the closing '">' in the page source.
redirected_url = r.text.split('URL=')[1].split('">')[0]
print(redirected_url)
# http://www.afaqs.com/interviews/index.html?id=572_The-target-is-to-get-advertisers-to-switch-from-print-to-TV-Ravish-Kumar-Viacom18

# Fetch the page the browser would have ended up on.
r = requests.get(redirected_url)
# Start scraping from this link...

Or, using a regex:

import re

# Non-greedy match, so the capture stops at the first '">' after the URL.
redirected_url = re.findall(r'URL=(http.*?)">', r.text)[0]
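Both snippets above raise an IndexError when the page contains no URL= marker. As a rough alternative, here is a minimal sketch that assumes the redirect is written as an HTML meta refresh tag (which the URL= and "> markers suggest, though the page source is not shown here). The names MetaRefreshFinder and find_meta_refresh are made up for this example; only the standard library is used.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class MetaRefreshFinder(HTMLParser):
    """Collects the target of the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("http-equiv", "").lower() != "refresh":
            return
        # The content attribute looks like "0; URL=http://example.com/new".
        _, _, rest = attributes.get("content", "").partition(";")
        key, _, value = rest.partition("=")
        if key.strip().lower() == "url" and value.strip():
            self.target = value.strip().strip("'\"")


def find_meta_refresh(html, base_url):
    """Return the absolute redirect target, or None if the page has none."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    if finder.target is None:
        return None
    # Resolve relative targets such as "/moved" against the page's own URL.
    return urljoin(base_url, finder.target)
```

With Requests, this could be called as find_meta_refresh(r.text, r.url); if the result is not None, pass it to requests.get(...) to fetch the final page. If the site redirects through a script instead of a meta tag, this sketch will return None and you would have to fall back to a regex on the script text.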
