Reputation: 113
I am trying to fetch the table contents of the Trading Statistics website so that I can analyze them. Because the page is a form, I used the Python requests library to send a POST request. Using Mozilla Firefox's Inspect feature, I copied the POST data, the request headers, and the request URL. However, the server returns an internal error, and the content, which should be JSON, is not returned. I have included the code and the error below. I don't know why I get this error or how to change my code to get the actual content.
import requests
import json
url='http://en.ime.co.ir/subsystems/ime/services/home/imedata.asmx/GetAmareMoamelatList'
headers={"Host": "en.ime.co.ir",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:95.0) Gecko/20100101 Firefox/95.0",
"Accept": "text/plain, */*; q=0.01",
"Accept-Language": "en-US,en;q=0.5",
"Accept-Encoding": "gzip, deflate",
"Content-Type": "application/json; charset=utf-8",
"X-Requested-With": "XMLHttpRequest",
"Content-Length": "119",
"Origin": "http://en.ime.co.ir",
"Connection": "keep-alive",
"Referer": "http://en.ime.co.ir/spot-trading-statistics.html",
"Cookie": "ASP.NET_SessionId=jh5e35r25o0mmlvvdotwsr2t; SiteBikeLoadBanacer=e565cf8d0d6f8936cafc1b4ba323aebc71ac38c59885ac721a10cf66fd62c302; AmareMoamelatGridTblsaveId.bs.table.columns=[0,1,2,3,4,5,7,8,9,10,11,12,15,17,20,21]"}
data={'fari':'false',
'GregorianFromDate':'2021/12/18',
'GregorianToDate':'2021/12/18',
'MainCat':'0',
'Cat':'0',
'SubCat':'0',
'Producer':'0'}
req=requests.post(url, verify=True,data=data,headers=headers)
print(req)
# <Response [500]>
print(req.content)
# b'The page cannot be displayed because an internal server error has occurred.'
Upvotes: 1
Views: 2529
Reputation: 330
The problem here is that the server only accepts a valid JSON body. When you pass a dict to data=, requests form-encodes it instead of sending JSON.
Here's what you can do:
import requests
import json
url = "https://en.ime.co.ir/subsystems/ime/services/home/imedata.asmx/GetAmareMoamelatList"
# Minimal headers will do
headers = {
"accept": "text/plain, */*; q=0.01",
"accept-language": "en-US,en;q=0.9",
"content-type": "application/json; charset=UTF-8"
}
# Set up your body data here
payload = {
"Language": "1",
"fari": "false",
"GregorianFromDate": "2021/12/18",
"GregorianToDate": "2021/12/18",
"MainCat": "0",
"Cat": "0",
"SubCat": "0",
"Producer": "0"
}
# Convert the dict into a json string
payload_json = json.dumps(payload)
req = requests.post(url, verify=True, data=payload_json, headers=headers)
print(req.content)
In this case, most of the headers aren't important because you're sending your HTTP request to an API endpoint.
The header you need to care about here is content-type. Setting it incorrectly will cause problems.
To make sure the destination server understands that you're sending a JSON body and not a raw body, you need to set the header "content-type": "application/json; charset=UTF-8".
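As an alternative sketch, requests can do the JSON encoding for you: passing the dict through the json= parameter serializes it and sets the Content-Type header to application/json. The example below only prepares the request (nothing is sent), so you can inspect what would go over the wire. The shortened payload is for illustration only.

```python
import json
import requests

url = "https://en.ime.co.ir/subsystems/ime/services/home/imedata.asmx/GetAmareMoamelatList"

# Shortened payload, for illustration only
payload = {"Language": "1", "fari": "false", "MainCat": "0"}

# Build the request without sending it
prepared = requests.Request("POST", url, json=payload).prepare()

print(prepared.headers["Content-Type"])    # application/json
print(prepared.headers["Content-Length"])  # computed by requests, no need to hardcode it
print(json.loads(prepared.body))           # the body parses back to the original dict
```

To actually send it, requests.post(url, json=payload, headers=headers) would be the equivalent call.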
You must encode your dict into a valid JSON string so that the server can process the data.
Let's see what the data looks like.
import json
#################
""" The server will definitely accept this. """
data = {
"Work" : "true",
"Sleep" : "false"
}
# Convert python dict into JSON string
data_json = json.dumps(data)
# It prints '{"Work": "true", "Sleep": "false"}'
print(repr(data_json))
#################
""" Most server should accept JSON string like this too, it's still valid. """
data = '{ \
"Work" : "true", \
"Sleep" : "false" \
}'
# It prints '{ "Work" : "true", "Sleep" : "false" }'
print(repr(data))
#################
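""" The server will NOT accept this: a dict is not a JSON string. """
data = {
"Work" : "true",
"Sleep" : "false"
}
# It prints {'Work': 'true', 'Sleep': 'false'}
# Notice the single quotes and the missing quotation marks at the start and end:
# this is a Python dict, not a JSON string. Passed to data=, requests form-encodes it.
print(repr(data))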
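To see why the plain dict fails, here is a rough sketch of the two body formats using only the standard library. It assumes urlencode is a fair stand-in for the form encoding that requests applies to a dict passed to data=.

```python
import json
from urllib.parse import urlencode

data = {"Work": "true", "Sleep": "false"}

form_body = urlencode(data)   # form-encoded, like data=<dict>
json_body = json.dumps(data)  # JSON string, like data=json.dumps(<dict>)

print(form_body)  # Work=true&Sleep=false
print(json_body)  # {"Work": "true", "Sleep": "false"}
```

Only the second form matches the application/json content type declared in the headers.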
Upvotes: 2