Reputation: 4152
I am trying to scrape this page, but I am having problems with the cookies when using the code below:
SelectProxy.select_proxy()
local_proxy = SelectProxy.global_proxy
session = requests.Session()
session.proxies = {local_proxy}
cookies = session.cookies
url = movie_url
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'en-GB,en-US;q=0.9,en;q=0.8',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    'Cookie': cookies,
    'Host': 'www.sky.com',
    'If-Modified-Since': 'Sat, 18 Aug 2018 14:45:31 GMT',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36'
}
r = session.get(url, headers=headers)
The error I am getting is as follows:
Traceback (most recent call last):
  File "G:\Python27\Kodi\Sky Q Movies Scraper.py", line 33, in <module>
    class sky_movies:
  File "G:\Python27\Kodi\Sky Q Movies Scraper.py", line 90, in sky_movies
    r = session.get(url, headers=headers)
  File "G:\Python27\lib\site-packages\requests\sessions.py", line 488, in get
    return self.request('GET', url, **kwargs)
  File "G:\Python27\lib\site-packages\requests\sessions.py", line 461, in request
    prep = self.prepare_request(req)
  File "G:\Python27\lib\site-packages\requests\sessions.py", line 394, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "G:\Python27\lib\site-packages\requests\models.py", line 295, in prepare
    self.prepare_headers(headers)
  File "G:\Python27\lib\site-packages\requests\models.py", line 409, in prepare_headers
    check_header_validity(header)
  File "G:\Python27\lib\site-packages\requests\utils.py", line 800, in check_header_validity
    "not %s" % (value, type(value)))
InvalidHeader: Header value <RequestsCookieJar[]> must be of type str or bytes, not <class 'requests.cookies.RequestsCookieJar'>
Can anyone advise what I am doing wrong?
Thanks
Upvotes: 1
Views: 5504
Reputation: 603
Basically, if requests receives any cookies from a server, they are wrapped in a CookieJar object. You are trying to put that object into a header, and header values can only be strings or bytes.
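To see the failure in isolation, here is a minimal sketch that reproduces it without touching the network. The URL `https://example.com/` is a placeholder rather than the page from the question, and it assumes a requests version that validates header types (the traceback in the question suggests yours does).

```python
import requests
from requests.exceptions import InvalidHeader

# session.cookies is a RequestsCookieJar, not a string.
jar = requests.Session().cookies

# Placeholder URL; nothing is sent, the request is only prepared.
request = requests.Request("GET", "https://example.com/", headers={"Cookie": jar})

try:
    request.prepare()  # header validation happens during preparation
    raised = False
except InvalidHeader as exc:
    raised = True
    print("rejected: %s" % exc)
```

The exception is raised while the request is being prepared, which is why the traceback ends in `prepare_headers` before anything reaches the server.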
As heemayl rightfully remarks, usually the best way to work with cookies in requests is by passing them through the cookies parameter in any request function (e.g. get, post, head etc.).
If you want to pass your own cookies, you need to create a CookieJar object yourself, set the cookies on the jar, and pass that through the cookies parameter, as described here.
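A rough sketch of building your own jar could look like the following. The cookie names, values, and domains are made up for illustration, and the request is only prepared (not sent) so you can inspect the `Cookie` header requests would generate.

```python
import requests

# Build a jar by hand; names, values, and domains here are hypothetical.
jar = requests.cookies.RequestsCookieJar()
jar.set("tasty_cookie", "yum", domain="example.com", path="/")
jar.set("other_cookie", "no", domain="example.org", path="/")  # different domain

# Pass the jar through the cookies parameter instead of the headers dict.
prepared = requests.Request("GET", "https://example.com/", cookies=jar).prepare()

# requests turns the matching cookies into a plain string header for us.
cookie_header = prepared.headers.get("Cookie")
print(cookie_header)
```

Only the cookie whose domain matches the request URL ends up in the header. To actually send the request, the same jar can be passed as `requests.get(url, cookies=jar)` or `session.get(url, cookies=jar)`.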
Upvotes: 2
Reputation: 42127
You are supposed to pass the cookies object via the cookies parameter of the requests.METHOD call (e.g. get(), post(), head() etc.), not via the header directly:
session.get(url, headers=headers, cookies=cookies)
and drop the Cookie header altogether.
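Put together, a trimmed-down version of the question's code might look like the sketch below. The helper name `prepare_movie_request`, the `https://www.sky.com/` URL, and the `example_cookie` entry are assumptions for illustration only, and the header list is shortened. The request is prepared rather than sent so the result can be inspected offline.

```python
import requests


def prepare_movie_request(session, url):
    """Build the GET request without a hand-written Cookie header."""
    headers = {
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Language": "en-GB,en-US;q=0.9,en;q=0.8",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    }
    request = requests.Request("GET", url, headers=headers)
    # The session merges its own cookie jar into the prepared request.
    return session.prepare_request(request)


session = requests.Session()
# Hypothetical cookie standing in for one a previous response would have set.
session.cookies.set("example_cookie", "1", domain="www.sky.com", path="/")

prepared = prepare_movie_request(session, "https://www.sky.com/")
print(prepared.headers.get("Cookie"))
```

Since a Session already sends the cookies it holds, `session.get(url, headers=headers)` should be enough in the common case. Passing `cookies=session.cookies` explicitly is also accepted, it is just redundant.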
The cookies object you have is an instance of the class requests.cookies.RequestsCookieJar; you can check the attributes on the object in the usual ways:
vars(cookies) # preferable
cookies.__dict__
and obviously can refer to attributes via the usual dotted lookup.
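Besides `vars()`, the jar has a few friendlier ways to look inside it. A small sketch, using a made-up `session_id` cookie in place of whatever your session actually holds:

```python
import requests

jar = requests.cookies.RequestsCookieJar()
# Hypothetical cookie, standing in for one received from a server.
jar.set("session_id", "abc123", domain="example.com", path="/")

# Plain name -> value mapping of everything in the jar.
as_dict = jar.get_dict()

# Dict-style lookup by cookie name.
value = jar["session_id"]

# Iterating yields Cookie objects, so dotted lookup gives the details.
details = [(c.name, c.value, c.domain, c.path) for c in jar]

print(as_dict)
print(details)
```

`vars(jar)` shows the jar's internal state (policy, nested storage), so `get_dict()` or iteration is usually easier to read when you only care about names and values.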
Upvotes: 2