Helix

Reputation: 27

Can't get data from site using requests in Python

I'm trying to get the text from this site. It is a simple, plain page containing only text. When I run the code below, the only thing it prints is a newline. I should mention that the page's content is dynamic and changes every few minutes. My requests version is 2.27.1, and I'm using Python 3.9 on Windows.

What could be the problem?

import requests

url = 'https://www.spaceweatherlive.com/includes/live-data.php?object=solar_flare&lang=EN'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36',
}

content = requests.get(url, headers=headers)
print(content.text)

This is an example of what the page should look like: Website screenshot

Upvotes: 0

Views: 612

Answers (2)

Anon Coward

Reputation: 10828

That particular server appears to be gating responses not on the User-Agent, but on the Accept-Encoding settings. You can get a normal response with:

import requests
url = "https://www.spaceweatherlive.com/includes/live-data.php?object=solar_flare&lang=EN"
headers = {
    "Accept-Encoding": "gzip, deflate, br",
}
content = requests.get(url, headers=headers)
print(content.text)

If the server starts answering with Brotli-compressed content (Content-Encoding: br), you might need to install the brotli package so that requests can decompress the response.
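To make the Brotli caveat concrete, here is a minimal sketch that sends the same Accept-Encoding header and raises a clear error if the response comes back Brotli-compressed while no decoder is installed. The helper names (`brotli_installed`, `uses_brotli`, `fetch_text`) and the 10-second timeout are my own choices for illustration, and I'm assuming that either the `brotli` or the `brotlicffi` package is what provides the decoder.

```python
import importlib.util

URL = "https://www.spaceweatherlive.com/includes/live-data.php?object=solar_flare&lang=EN"
HEADERS = {"Accept-Encoding": "gzip, deflate, br"}


def brotli_installed():
    """Return True if a Brotli decoder package can be found (assumed names)."""
    return any(
        importlib.util.find_spec(name) is not None
        for name in ("brotli", "brotlicffi")
    )


def uses_brotli(content_encoding):
    """Return True if a Content-Encoding header value lists 'br'."""
    if not content_encoding:
        return False
    tokens = [token.strip().lower() for token in content_encoding.split(",")]
    return "br" in tokens


def fetch_text(url=URL):
    """Fetch the page text, failing loudly if the body cannot be decompressed."""
    import requests  # imported here so the helpers above work without it

    response = requests.get(url, headers=HEADERS, timeout=10)
    response.raise_for_status()
    encoding = response.headers.get("Content-Encoding")
    if uses_brotli(encoding) and not brotli_installed():
        raise RuntimeError("Response is Brotli-compressed; try: pip install brotli")
    return response.text
```

Calling `print(fetch_text())` should then either print the page text or tell you that the decoder is missing, rather than printing unreadable bytes.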

Upvotes: 1

shivankgtm

Reputation: 1242

You just need to add a User-Agent header, as shown below.

import requests

url = "https://www.spaceweatherlive.com/includes/live-data.php?object=solar_flare&lang=EN"

headers = {
    'User-Agent': 'PostmanRuntime/7.29.0',
    'Accept': '*/*',
    'Cache-Control': 'no-cache',
    'Host': 'www.spaceweatherlive.com',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive'
}
response = requests.get(url, headers=headers)
print(response.text)
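Since this header set changes several things at once, it is hard to tell which header the server actually reacts to. A rough way to check is to send each header on its own and compare how much text comes back. The sketch below assumes that an empty or whitespace-only body means the request was rejected. The function names `single_header_variants` and `probe` are made up for this example. Note that requests still adds its own defaults for any header you do not pass.

```python
def single_header_variants(headers):
    """Return one {name: value} dict per header, so each can be tested alone."""
    return [{name: value} for name, value in headers.items()]


def probe(url, headers):
    """Map each header name to the length of the body received with only it set."""
    import requests  # imported here so the helper above works without it

    results = {}
    # Baseline: nothing but the defaults that requests adds by itself.
    baseline = requests.get(url, timeout=10)
    results["(defaults only)"] = len(baseline.text.strip())
    for variant in single_header_variants(headers):
        name = next(iter(variant))
        response = requests.get(url, headers=variant, timeout=10)
        results[name] = len(response.text.strip())
    return results
```

Running `print(probe(url, headers))` with the dictionary above gives one length per header; any entry that is clearly larger than the baseline points at the header that matters.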

Upvotes: 0
