Reputation: 31
I was scraping a secured (HTTPS) website for a practice project when I ran into this error:
sock.settimeout(timeout)
TypeError: an integer is required (got type dict)
My code is:
>>> import urllib.request
>>> import bs4
>>> from urllib.request import urlopen as uReq
>>> from bs4 import BeautifulSoup as soup
>>> headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
>>> my_url = uReq('https://www.justdial.com/Mumbai/311/B2b_fil', None, headers)
The full traceback is:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python\Python36-32\lib\urllib\request.py", line 223, in urlopen
return opener.open(url, data, timeout)
File "C:\Python\Python36-32\lib\urllib\request.py", line 526, in open
response = self._open(req, data)
File "C:\Python\Python36-32\lib\urllib\request.py", line 544, in _open
'_open', req)
File "C:\Python\Python36-32\lib\urllib\request.py", line 504, in _call_chain
result = func(*args)
File "C:\Python\Python36-32\lib\urllib\request.py", line 1361, in https_open
context=self._context, check_hostname=self._check_hostname)
File "C:\Python\Python36-32\lib\urllib\request.py", line 1318, in do_open
encode_chunked=req.has_header('Transfer-encoding'))
File "C:\Users\Python\Python36-32\lib\http\client.py", line 1239, in request
self._send_request(method, url, body, headers, encode_chunked)
File "C:\Python\Python36-32\lib\http\client.py", line 1285, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "C:\Python\Python36-32\lib\http\client.py", line 1234, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "C:\Python\Python36-32\lib\http\client.py", line 1026, in _send_output
self.send(msg)
File "C:\Python\Python36-32\lib\http\client.py", line 964, in send
self.connect()
File "C:\Python\Python36-32\lib\http\client.py", line 1392, in connect
super().connect()
File "C:\Python\Python36-32\lib\http\client.py", line 936, in connect
(self.host,self.port), self.timeout, self.source_address)
File "C:\Python\Python36-32\lib\socket.py", line 710, in create_connection
sock.settimeout(timeout)
TypeError: an integer is required (got type dict)
Upvotes: 0
Views: 2928
Reputation: 1691
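Create a Request object before sending. The third positional argument of urlopen is timeout, so urlopen(url, None, headers) passes your headers dict as the timeout, which is why sock.settimeout raises the TypeError. Pass the headers to Request instead: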
import urllib.request

def main():
    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
    url = 'https://www.justdial.com/Mumbai/311/B2b_fil'
    # Attach the headers to a Request object instead of passing them to urlopen.
    req = urllib.request.Request(url, None, headers)
    response = urllib.request.urlopen(req)
    print(response.read())
    # The original call fails because headers lands in urlopen's timeout slot:
    # my_url = uReq(url, None, headers)

if __name__ == '__main__':
    main()
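To see why the same three arguments work for Request but not for urlopen, you can compare the positional parameter order of the two callables. This is only a small sketch: the shortened User-Agent value is a placeholder I chose, and nothing is sent over the network.

```python
import inspect
from urllib.request import Request, urlopen

# Positional parameter names, in order, for each callable.
urlopen_params = list(inspect.signature(urlopen).parameters)
request_params = list(inspect.signature(Request).parameters)

# The third slot is where the headers dict ended up in the question.
print("urlopen:", urlopen_params[:3])
print("Request:", request_params[:3])

# Placeholder header value for illustration; building a Request opens no connection.
headers = {"User-Agent": "Mozilla/5.0"}
req = Request("https://www.justdial.com/Mumbai/311/B2b_fil", None, headers)

# Request stores header names via str.capitalize(), hence "User-agent".
print(req.get_header("User-agent"))
```

The third parameter of urlopen is timeout, while the third parameter of Request is headers, so the dict has to go to Request.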
Upvotes: 1
Reputation: 1433
Use the requests module instead; it is easier, as shown below.
import bs4
import requests
from bs4 import BeautifulSoup as soup
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
my_url = requests.get('https://www.justdial.com/Mumbai/311/B2b_fil', headers=headers)
If you really want to use urllib, it would look something like this:
from urllib.request import Request, urlopen
import bs4
from bs4 import BeautifulSoup as soup
request = Request('http://api.company.com/items/details?country=US&language=en')
request.add_header('User-Agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36')
response = urlopen(request).read()
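If you also want a timeout with urllib, one way to avoid the mix-up from the question is to pass it by keyword. The following is a rough sketch, not the only way to do it: the fetch helper, its 10 second default, and the data: URL used for the offline check are my own assumptions and are not part of the original code.

```python
from urllib.request import Request, urlopen


def fetch(url, headers, timeout=10):
    """Return the raw response body for url (hypothetical helper)."""
    # Headers belong on the Request object.
    request = Request(url, headers=headers)
    # timeout is passed by keyword so it cannot be confused with data or headers.
    with urlopen(request, timeout=timeout) as response:
        return response.read()


# Offline check with a data: URL, which urllib can open without any network access.
body = fetch("data:text/plain,hello", {"User-Agent": "Mozilla/5.0"})
print(body)
```

For the real site you would call fetch with the page URL and the full User-Agent string from the question.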
Upvotes: 2