Hanalia

Reputation: 197

Python requests POST not returning the HTML table

I am trying to scrape an HTML table from the URL below.

https://www.customs.go.jp/toukei/srch/indexe.htm?M=05&P=1,2,,,,,,,,1,0,2020,0,12,0,2,230660,,,,,,,,,,1,,,,,,,,,,,,,,,,,,,,,,200

Using the Chrome developer tools, I found that the actual data comes from a redirected URL, so I wrote the code below:

import requests

           
headers={'User-Agent':'Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36'}

data = {'CW_SEARCHID': 'JCCHT05S',
'CW_JAPANKBN': '2',
'CW_IMPKBN': '2',
'CW_YMKBN': '1',
'CW_SYY': '2020',
'CW_SMM': '12',
'CW_HSKBN': '2',
'CW_HSCODE': '230660',
'CW_KUNIKBN': '1',
'CW_ZMKBN': '1',
'CW_MEISAICNT': '200'}

newurl = "https://www.customs.go.jp/JCWSV02/servlet/JCWSV02"
r2 = requests.post(newurl,headers=headers,data=data)
print (r2.text)

However, the above code does not return the table, and I do not know why.

Here is what I have tried so far:

  1. adding cookies as input

I have tried to add the cookies as below, but the results were the same.

cookies = {'visid_incap_763612':'dXFhIavZRrW8jvit8CkY9zirL2AAAAAAQUIPAAAAAACs+oxBQjxSp9TdZl25YI/Y','incap_ses_948_763612':'lRxvM4bArjwmSaqzRPknDYHIL2AAAAAAFXLtsiRyEhyFOCzgsz8MXA=='}

  2. passing the data as a raw string instead of a dict

I have tried to pass my data as a raw string, but the results were the same.

data = "CW_SEARCHID=JCCHT05S&CW_JAPANKBN=2&CW_IMPKBN=2&CW_CARGOKBN=&CW_SUMKBN=&CW_SPCODE=&CW_SPNAME=&CW_YMSORTKBN=&CW_SISUKBN=&CW_SENKIKBN=&CW_HKKBN=&CW_YMKBN=1&CW_KI=&CW_SYY=2020&CW_EYY=&CW_SMM=12&CW_EMM=&CW_HSKBN=2&CW_HSCODE=230660&CW_HSNAME=&CW_HSCODE=&CW_HSNAME=&CW_HSCODE=&CW_HSNAME=&CW_HSCODE=&CW_HSNAME=&CW_HSCODE=&CW_HSNAME=&CW_HSCODE=&CW_HSNAME=&CW_HSCODE=&CW_HSNAME=&CW_HSCODE=&CW_HSNAME=&CW_HSCODE=&CW_HSNAME=&CW_HSCODE=&CW_HSNAME=&CW_KUNIKBN=1&CW_KUNICODE=&CW_KUNINAME=&CW_KUNICODE=&CW_KUNINAME=&CW_KUNICODE=&CW_KUNINAME=&CW_KUNICODE=&CW_KUNINAME=&CW_KUNICODE=&CW_KUNINAME=&CW_KUNICODE=&CW_KUNINAME=&CW_KUNICODE=&CW_KUNINAME=&CW_KUNICODE=&CW_KUNINAME=&CW_KUNICODE=&CW_KUNINAME=&CW_KUNICODE=&CW_KUNINAME=&CW_ZMKBN=1&CW_ZMCODE=&CW_ZMNAME=&CW_ZMCODE=&CW_ZMNAME=&CW_ZMCODE=&CW_ZMNAME=&CW_ZMCODE=&CW_ZMNAME=&CW_ZMCODE=&CW_ZMNAME=&CW_ZMCODE=&CW_ZMNAME=&CW_ZMCODE=&CW_ZMNAME=&CW_ZMCODE=&CW_ZMNAME=&CW_ZMCODE=&CW_ZMNAME=&CW_ZMCODE=&CW_ZMNAME=&CW_MEISAICNT=200"

  3. adding verify=True

Since the URL is an HTTPS URL, I tried adding 'verify=True' to my requests.post call, but the results were the same.

r2 = requests.post(newurl,headers=headers,data=data,cookies=cookies,verify=True)

Can anyone give me some advice?

Upvotes: 1

Views: 257

Answers (1)

SIM

Reputation: 22440

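Try the following to get the required response. It turns out that you need to send all the keys and values within data, including the empty ones, which your dict version did not do. Note that the headers below also include a referer and a content-type. As a quick test I did the following and got the expected results: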

import requests
         
headers = {
    'User-Agent':'Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36',
    'referer':'https://www.customs.go.jp/toukei/srch/jccht00p.htm',
    'content-type': 'application/x-www-form-urlencoded'
}

data = "CW_SEARCHID=JCCHT05S&CW_JAPANKBN=2&CW_IMPKBN=2&CW_CARGOKBN=&CW_SUMKBN=&CW_SPCODE=&CW_SPNAME=&CW_YMSORTKBN=&CW_SISUKBN=&CW_SENKIKBN=&CW_HKKBN=&CW_YMKBN=1&CW_KI=&CW_SYY=2020&CW_EYY=&CW_SMM=12&CW_EMM=&CW_HSKBN=2&CW_HSCODE=230660&CW_HSNAME=&CW_HSCODE=&CW_HSNAME=&CW_HSCODE=&CW_HSNAME=&CW_HSCODE=&CW_HSNAME=&CW_HSCODE=&CW_HSNAME=&CW_HSCODE=&CW_HSNAME=&CW_HSCODE=&CW_HSNAME=&CW_HSCODE=&CW_HSNAME=&CW_HSCODE=&CW_HSNAME=&CW_HSCODE=&CW_HSNAME=&CW_KUNIKBN=1&CW_KUNICODE=&CW_KUNINAME=&CW_KUNICODE=&CW_KUNINAME=&CW_KUNICODE=&CW_KUNINAME=&CW_KUNICODE=&CW_KUNINAME=&CW_KUNICODE=&CW_KUNINAME=&CW_KUNICODE=&CW_KUNINAME=&CW_KUNICODE=&CW_KUNINAME=&CW_KUNICODE=&CW_KUNINAME=&CW_KUNICODE=&CW_KUNINAME=&CW_KUNICODE=&CW_KUNINAME=&CW_ZMKBN=1&CW_ZMCODE=&CW_ZMNAME=&CW_ZMCODE=&CW_ZMNAME=&CW_ZMCODE=&CW_ZMNAME=&CW_ZMCODE=&CW_ZMNAME=&CW_ZMCODE=&CW_ZMNAME=&CW_ZMCODE=&CW_ZMNAME=&CW_ZMCODE=&CW_ZMNAME=&CW_ZMCODE=&CW_ZMNAME=&CW_ZMCODE=&CW_ZMNAME=&CW_ZMCODE=&CW_ZMNAME=&CW_MEISAICNT=200"

newurl = "https://www.customs.go.jp/JCWSV02/servlet/JCWSV02"
r2 = requests.post(newurl,headers=headers,data=data)
print (r2.text)
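
If you would rather not maintain that long string by hand, the same body can be built from a list of (key, value) tuples. This is only a sketch, not something I have run against the site: a plain dict cannot hold the repeated CW_HSCODE / CW_KUNICODE / CW_ZMCODE keys, whereas requests form-encodes a list of tuples in order and sets the form content-type itself. The helper names (repeated, build_payload, fetch_table_html), the SLOTS constant and the timeout value are my own, and the count of ten slots per group is an assumption read off the captured request body.

```python
from urllib.parse import urlencode

URL = "https://www.customs.go.jp/JCWSV02/servlet/JCWSV02"

# Assumption: the form has ten code/name slots per repeated group,
# as the captured request body appears to show.
SLOTS = 10


def repeated(code_key, name_key, first_code=""):
    """Return SLOTS (code, name) pairs; only the first code slot is filled."""
    pairs = []
    for i in range(SLOTS):
        pairs.append((code_key, first_code if i == 0 else ""))
        pairs.append((name_key, ""))
    return pairs


def build_payload(hs_code="230660", year="2020", month="12", max_rows="200"):
    """Build the form fields in the same order as the browser request."""
    fields = [
        ("CW_SEARCHID", "JCCHT05S"),
        ("CW_JAPANKBN", "2"),
        ("CW_IMPKBN", "2"),
    ]
    # Fields the browser sends empty; they are included anyway.
    fields += [(key, "") for key in (
        "CW_CARGOKBN", "CW_SUMKBN", "CW_SPCODE", "CW_SPNAME",
        "CW_YMSORTKBN", "CW_SISUKBN", "CW_SENKIKBN", "CW_HKKBN",
    )]
    fields += [
        ("CW_YMKBN", "1"), ("CW_KI", ""),
        ("CW_SYY", year), ("CW_EYY", ""),
        ("CW_SMM", month), ("CW_EMM", ""),
        ("CW_HSKBN", "2"),
    ]
    fields += repeated("CW_HSCODE", "CW_HSNAME", hs_code)
    fields += [("CW_KUNIKBN", "1")]
    fields += repeated("CW_KUNICODE", "CW_KUNINAME")
    fields += [("CW_ZMKBN", "1")]
    fields += repeated("CW_ZMCODE", "CW_ZMNAME")
    fields += [("CW_MEISAICNT", max_rows)]
    return fields


def fetch_table_html(fields):
    """POST the fields and return the response HTML (performs a network call)."""
    import requests  # third-party; imported here so the helpers above work without it

    headers = {
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_0) AppleWebKit/537.36 "
                      "(KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36",
        "Referer": "https://www.customs.go.jp/toukei/srch/jccht00p.htm",
    }
    # A list of tuples is form-encoded by requests, repeated keys included.
    response = requests.post(URL, headers=headers, data=fields, timeout=30)
    response.raise_for_status()
    return response.text


# Preview of the encoded body; no request is sent here.
print(urlencode(build_payload())[:80])
```

Calling print(fetch_table_html(build_payload())) should then send a body equivalent to the string above, provided the slot count assumption holds. You can compare urlencode(build_payload()) with the body shown in the developer tools before sending anything.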

Upvotes: 2
