Reputation: 467
I'm very new to web scraping. I know nothing about cookies, which seem to be the problem here. I'm trying something very simple, i.e. doing a requests.get() on some website, then playing with Beautiful Soup:
import requests
from bs4 import BeautifulSoup
page = requests.get("https://www.immoweb.be/fr/recherche/maison/a-vendre/brabant-wallon?minprice=100000&maxprice=200000&minroom=3&maxroom=20")
print(page)
soup = BeautifulSoup(page.content, 'html.parser')
print(soup.prettify())
This basically doesn't work, as the print(soup.prettify()) says: "Request unsuccessful. Incapsula incident ID: 449001030063484539-234265426366891642"
That's ok, I found out that it's because my get() needs some cookies. So I used the method described here: I created a dict of the cookies and passed it as an argument to my get():
cookies = {'incap_ses_449_150286':'ll/1bp9r6ifi7LPUDiw7Bi/dzlwAAAAAO6OR80W3VDDesKNGYZv4PA==', 'visid_incap_150286':'+Tg7VstMS1OzBycT4432Ey/dzlwAAAAAQUIPAAAAAAAqAettOJXSb8ocwxkzabRx'}
page = requests.get("https://www.immoweb.be/fr/recherche/maison/a-vendre/brabant-wallon?minprice=100000&maxprice=200000&minroom=3&maxroom=20", cookies=cookies)
...and now the print(soup.prettify()) prints the whole page, ok.
But if I shut down my computer, come back the next day, and run my script again, the cookies I hardcoded no longer work, because their values have actually changed, right? That's what I observe: just re-running my script doesn't work anymore. I guess it's normal cookie behavior for the values to change from one day to the next (?).
So I thought I might fetch these cookies automatically before doing my requests.get(). So I did this:
session = requests.Session()
response = session.get("https://www.immoweb.be/fr/recherche/maison/a-vendre/brabant-wallon?minprice=100000&maxprice=200000&minroom=3&maxroom=20")
cookies = session.cookies.get_dict()
When doing this, I do get 2 cookies (the 'incap_ses_449_150286' one and the other), but with different values than what I see in Chrome's developer tools on the web page. And passing these cookies to my get() doesn't seem to work: I no longer get the "Request unsuccessful" message, but the print(soup.prettify()) prints close to nothing. The only way I can get it working correctly is by manually putting the cookies in the dict, after looking them up with Chrome's tools... What am I missing?
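For reference, the usual way to avoid hardcoding cookie values is to route every request through the same Session object: its cookie jar stores whatever Set-Cookie headers the server sends and replays them automatically on later requests. A minimal sketch against the question's URL (whether Incapsula's bot detection then accepts the request is a separate matter):

```python
import requests

SEARCH_URL = ("https://www.immoweb.be/fr/recherche/maison/a-vendre/"
              "brabant-wallon?minprice=100000&maxprice=200000"
              "&minroom=3&maxroom=20")

def fetch_with_session(url):
    """Fetch url twice through one Session: cookies set by the first
    response (e.g. Incapsula's incap_ses_* / visid_incap_* values) land
    in session.cookies and are sent back automatically the second time."""
    session = requests.Session()
    session.get(url)  # first request: the server sets its cookies here
    return session.get(url), session.cookies.get_dict()

# Usage (requires network access):
# page, cookies = fetch_with_session(SEARCH_URL)
# print(cookies)  # the live incap_ses_* / visid_incap_* values
```

The key point is that the cookies never need to be copied by hand: the jar is filled by the server's own responses, so the values are always current.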
Thanks a lot! Arnaud
Upvotes: 0
Views: 784
Reputation: 6426
This isn't a Python issue. The web server you're connecting to is very particular about what it lets access its site. Something differs between your web browser and requests that the server detects, causing it to allow one and deny the other. The cookies are probably there so it doesn't have to keep repeating that detection (the error message mentions Incapsula, a bot-protection service), and by copying the cookies from Chrome to requests you're circumventing it.
Have you tried setting the user agent to Chrome's? Also, check the site's robots.txt to see whether it allows web scrapers; it may be that the website owners don't want you doing this, and it seems they've already put measures in place to prevent it.
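To try both suggestions concretely: send a browser-like User-Agent header with the request, and consult robots.txt before scraping. The User-Agent string below is just an example Chrome string, not anything specific to this site; a sketch:

```python
import urllib.robotparser

import requests

# Example Chrome-like User-Agent string; any recent browser UA works.
HEADERS = {
    "User-Agent": ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                   "AppleWebKit/537.36 (KHTML, like Gecko) "
                   "Chrome/74.0.3729.169 Safari/537.36")
}

def fetch_as_browser(url):
    """Fetch url with a browser-like User-Agent instead of requests'
    default 'python-requests/x.y.z', which detection services often flag."""
    return requests.get(url, headers=HEADERS)

def allowed_by_robots(robots_lines, user_agent, page_url):
    """Check robots.txt rules (given as a list of lines) for permission
    to fetch page_url. For a live site you would instead call
    rp.set_url("https://<site>/robots.txt") followed by rp.read()."""
    rp = urllib.robotparser.RobotFileParser()
    rp.parse(robots_lines)
    return rp.can_fetch(user_agent, page_url)

# Offline demonstration with a made-up robots.txt:
sample = ["User-agent: *", "Disallow: /private/"]
print(allowed_by_robots(sample, HEADERS["User-Agent"],
                        "https://example.com/fr/recherche/"))  # True
```

If robots.txt disallows the path, the polite option is simply not to scrape it; if it's allowed, the custom header at least makes the request look less like a default script.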
Upvotes: 1