ColeWorld

Reputation: 279

Any way to scrape a link that redirects?

Is there any way to make Python follow a link, such as a bit.ly link, and then scrape the page it leads to? On the page I am scraping, the only link I can get is one that redirects, and the information I need is on the page it redirects to.

Upvotes: 4

Views: 7418

Answers (1)

furas

Reputation: 142734

There are three types of redirection:

  • HTTP - sent in the response headers (status code 301, 302, or another 3xx)
  • HTML - a <meta> tag in the page (see Wikipedia: Meta refresh)
  • JavaScript - code such as window.location = new_url
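Only the HTTP type is handled at the HTTP level; a meta refresh has to be read out of the page itself. As a rough sketch of how that could be done with the standard library only, the hypothetical helper find_meta_refresh below (the name, the class, and the regular expression are illustrative assumptions, not part of requests) returns the target of the first <meta http-equiv="refresh"> tag, or None if there is none:

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class MetaRefreshParser(HTMLParser):
    """Remembers the target of the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attrs = dict(attrs)
        if (attrs.get("http-equiv") or "").lower() != "refresh":
            return
        # The content attribute usually looks like "0; url=http://example.com/"
        match = re.search(r"url\s*=\s*['\"]?([^'\"]+)",
                          attrs.get("content") or "", re.I)
        if match:
            self.target = match.group(1).strip()


def find_meta_refresh(html, base_url=""):
    """Return the absolute meta-refresh target found in html, or None."""
    parser = MetaRefreshParser()
    parser.feed(html)
    if parser.target is None:
        return None
    # Resolve relative targets against the URL the page was fetched from
    return urljoin(base_url, parser.target)
```

With a response r from requests, this would be called as find_meta_refresh(r.text, r.url), and the returned URL could then be fetched with another requests.get. JavaScript redirects are not covered by this sketch.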

requests follows HTTP redirects automatically. It keeps the intermediate responses in r.history, and the final URL is in r.url.

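import requests

# requests.get() follows HTTP redirects by default
r = requests.get('http://' + 'bit.ly/english-4-it')

print(r.history)  # intermediate redirect responses
print(r.url)      # final URL after all redirects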

result:

[<Response [301]>, <Response [301]>]
http://helion.pl/ksiazki/english-4-it-praktyczny-kurs-jezyka-angielskiego-dla-specjalistow-it-i-nie-tylko-beata-blaszczyk,anginf.htm
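To see where each hop went rather than only the status codes, the url and status_code attributes of every response in r.history can be listed. A minimal sketch, where redirect_chain is a hypothetical helper name and not something requests provides:

```python
def redirect_chain(response):
    """Return (status_code, url) for every hop, ending with the final response."""
    hops = [(r.status_code, r.url) for r in response.history]
    hops.append((response.status_code, response.url))
    return hops


# Usage with requests (needs network access):
#   import requests
#   r = requests.get('http://' + 'bit.ly/english-4-it')
#   for status, url in redirect_chain(r):
#       print(status, url)
```

The helper only reads attributes, so it works on any requests response, whether or not a redirect took place.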

BTW: SO doesn't allow bit.ly links in posts, so I built the URL with string concatenation.

Upvotes: 11
