Reputation: 13
I am working on a web scraping project using Python and BeautifulSoup. I want to visit 1000+ URLs and extract the publication month of the issue on each page.
So far I have tried the following code, but it raises an error. I'm fairly new to web scraping.
from bs4 import BeautifulSoup
import requests
import time
page = requests.get("https://academic.oup.com/cesifo/issue/64/3?browseBy=volume")
time.sleep(5)
soup = BeautifulSoup(page.content, 'html.parser')
The error is:
requests.exceptions.ConnectionError: ('Connection aborted.', OSError("(10054, 'WSAECONNRESET')"))
Kindly suggest a way through this.
Upvotes: 0
Views: 73
Reputation: 22440
Try sending a User-Agent header with the request to get the content from that site. I'm not sure whether this is exactly the output you want to grab, but the fix for the connection error is to use headers.
from bs4 import BeautifulSoup
import requests
url = "https://academic.oup.com/cesifo/issue/64/3?browseBy=volume"
# Send a browser-like User-Agent header with the request
page = requests.get(url, headers={"User-Agent": "Mozilla/5.0"})
soup = BeautifulSoup(page.content, 'html.parser')
# The element under the page's h1 holds the volume, issue and publication date
oDate = soup.select_one("h1 > .issue-info-pub").text
print(oDate)
Output:
Volume 64, Issue 3, September 2018
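Since the question asks for just the month across 1000+ URLs, here is a rough sketch of how the same idea could be extended. It assumes every issue page uses the same "h1 > .issue-info-pub" element and spells the month out in English, which I have only checked for the one URL above. The names extract_month and scrape_months, the 2 second delay and the 30 second timeout are my own illustrative choices, not anything the site requires.

```python
import re
import time

import requests
from bs4 import BeautifulSoup

# Same header as in the answer above.
HEADERS = {"User-Agent": "Mozilla/5.0"}

# Assumption: the month is written out in full English, e.g. "September".
MONTH_RE = re.compile(
    r"\b(January|February|March|April|May|June|July|August|"
    r"September|October|November|December)\b"
)


def extract_month(html):
    """Return the month name from the issue heading, or None if not found."""
    soup = BeautifulSoup(html, "html.parser")
    node = soup.select_one("h1 > .issue-info-pub")
    if node is None:
        return None
    match = MONTH_RE.search(node.get_text())
    return match.group(1) if match else None


def scrape_months(urls, delay=2.0):
    """Map each URL to its month (None on failure), pausing between requests."""
    months = {}
    # A Session reuses the connection and sends the header on every request.
    with requests.Session() as session:
        session.headers.update(HEADERS)
        for url in urls:
            try:
                response = session.get(url, timeout=30)
                response.raise_for_status()
                months[url] = extract_month(response.content)
            except requests.RequestException:
                # Record the failure and carry on with the remaining URLs.
                months[url] = None
            time.sleep(delay)
    return months
```

Calling scrape_months(list_of_urls) returns a dict keyed by URL. For a heading like the one printed above, extract_month gives "September". Any URL that maps to None is worth rechecking by hand, since it either failed to download or did not match the assumed page layout.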
Upvotes: 1