Reputation: 13434
I have a piece of Python 3 code that fetches a webpage every 10 seconds which gives back some JSON information:
s = requests.Session()
while True:
r = s.get(currenturl)
data = r.json()
datetime = data['Timestamp']['DateTime']
value = data['PV']
print(str(datetime) + ": " + str(value) + "W")
time.sleep(10)
The output of this code is:
2020-10-13T13:26:53: 888W
2020-10-13T13:26:53: 888W
2020-10-13T13:26:53: 888W
2020-10-13T13:26:53: 888W
As you can see, the DateTime does not change with every iteration. When I refresh the page manually in my browser it does get updated every time.
I have tried adding
Cache-Control max-age=0
to the headers of my request but that does not resolve the issue.
Even when explicitely setting everything to None after loop, the same issue remains:
while True:
r = s.get(currenturl, headers={'Cache-Control': 'no-cache'})
data = r.json()
datetime = data['Timestamp']['DateTime']
value = data['PV']
print(str(datetime) + ": " + str(value) + "W")
time.sleep(10)
counter += 1
r = None
data = None
datetime = None
value = None
How can I "force" a refresh of the page with requests.get()?
Upvotes: 1
Views: 1637
Reputation: 13434
It turns out this particular website doesn't continuously refresh on its own, unless the request comes from its parent url.
r = s.get(currenturl, headers={'Referer' : 'https://originalurl.com/example'})
I had to include the original parent URL as referer. Now it works as expected:
2020-10-13T15:32:27: 889W
2020-10-13T15:32:37: 889W
2020-10-13T15:32:47: 884W
2020-10-13T15:32:57: 884W
2020-10-13T15:33:07: 894W
Upvotes: 1