David

Reputation: 379

How to select a value from a dropdown using requests in Python 3?

I want to scrape data from the website https://xlnindia.gov.in/frm_G_Cold_S_Query.aspx. I need to select the State as Delhi and the District as Adarsh Nagar (4), click the Search button, and scrape all the information.

So far I have tried the following code:

import requests
from bs4 import BeautifulSoup

Initially I was getting an 'HTTPS 443 SSL' error, which I resolved by passing verify=False:

resp = requests.get('https://xlnindia.gov.in/frm_G_Cold_S_Query.aspx',verify=False)
soup = BeautifulSoup(resp.text,"lxml")

dictinfo = {i['name']:i.get('value','') for i in soup.select('input[name]')}
dictinfo['ddlState']='Delhi'
dictinfo['ddldistrict']='Adarsh Nagar (4)'
dictinfo['__EVENTTARGET']='btnSearch'
dictinfo = {k:(None,str(v)) for k,v in dictinfo.items()}
r=requests.post('https://xlnindia.gov.in/frm_G_Cold_S_Query.aspx',verify=False,files=dictinfo)
The POST comes back as Response [500], and the response body (parsed into soup2) contains this error:

Invalid postback or callback argument. Event validation is enabled using <pages enableEventValidation="true"/> in configuration or <%@ Page EnableEventValidation="true" %> in a page. For security purposes, this feature verifies that arguments to postback or callback events originate from the server control that originally rendered them. If the data is valid and expected, use the ClientScriptManager.RegisterForEventValidation method in order to register the postback or callback data for validation.

Can someone please help me scrape this data?

(I can only use the REQUESTS & BEAUTIFULSOUP libraries, not SELENIUM, MECHANIZE, etc.)

Upvotes: 0

Views: 1233

Answers (1)

SIM

Reputation: 22440

Try the script below to get the tabular results that are populated when the two dropdown items you mentioned are selected on that webpage. It turns out you have to make two consecutive POST requests to populate the results.

import requests
from bs4 import BeautifulSoup
import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

url = 'https://xlnindia.gov.in/frm_G_Cold_S_Query.aspx'

with requests.Session() as s:
    s.headers['User-Agent'] = 'Mozilla/5.0'
    resp = s.get(url, verify=False)
    soup = BeautifulSoup(resp.text, "lxml")

    # Collect the hidden ASP.NET fields (__VIEWSTATE, __EVENTVALIDATION, ...)
    # and post back the state selection first.
    dictinfo = {i['name']: i.get('value', '') for i in soup.select('input[name]')}
    dictinfo['ddlState'] = 'DL'

    res = s.post(url, data=dictinfo)
    soup_obj = BeautifulSoup(res.text, "lxml")

    # The district dropdown is only populated after the state postback, so
    # grab the refreshed hidden fields and post the district selection next.
    payload = {i['name']: i.get('value', '') for i in soup_obj.select('input[name]')}
    payload['ddldistrict'] = 'ADN'

    r = s.post(url, data=payload)
    sauce = BeautifulSoup(r.text, "lxml")
    for items in sauce.select("#dgDisplay tr"):
        data = [item.get_text(strip=True) for item in items.select("td")]
        print(data)

The output you may see in the console looks like:

['Firm Name', 'City', 'Licences', 'Reg. Pharmacists / Comp. Person']
['A ONE MEDICOS', 'DELHI-251/1, GALI NO.1, KH, NO, 739/251/1, NEAR HIMACHAL BHAWAN,SARAI PIPAL THALA, VILLAGE AZAD PUR,', 'R - 2', 'virender kumar, DPH, [22295-17/10/2013]']
['AAROGYAM', 'DELHI-PVT. SHOP NO. 1, GF, 121,VILLAGE BHAROLA', 'R - 2', 'avinesh bhadoriya, DPH, [27033-]']
['ABCO INDIA', 'DELHI-SHOP NO-452/22,BHUSHAN BHAWAN RING ROAD,FLYOVER AZAD PUR', 'W - 2', 'sanjay dubey , SSC, [C-P-03/01/1997]']
['ADARSH MEDICOS', 'DELHI-NORTHERN SIDE B-107, GALI NO. 1,,MAJLIS PARK, VILLAGE BHAROLA,', 'R - 2', 'dilip kumar, BPH, [28036-11/01/2018]']
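
A quick side note: 'DL' and 'ADN' are the value attributes of the dropdown options, not the visible text 'Delhi' and 'Adarsh Nagar (4)'; posting the visible text is what triggers the event validation error you saw. If you need the codes for another state or district, a minimal sketch like the one below should print them. I'm assuming the select elements carry the same ddlState/ddldistrict names used in the form data above.

import requests
from bs4 import BeautifulSoup
import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

url = 'https://xlnindia.gov.in/frm_G_Cold_S_Query.aspx'

with requests.Session() as s:
    s.headers['User-Agent'] = 'Mozilla/5.0'
    soup = BeautifulSoup(s.get(url, verify=False).text, "lxml")

    # State codes are available on the first page load, e.g. 'DL' -> 'Delhi'
    for option in soup.select("select[name='ddlState'] option"):
        print(option.get('value', ''), '->', option.get_text(strip=True))

    # District codes only appear after the state has been posted back, so
    # replay the first POST and read the refreshed district dropdown,
    # e.g. 'ADN' -> 'Adarsh Nagar (4)'
    payload = {i['name']: i.get('value', '') for i in soup.select('input[name]')}
    payload['ddlState'] = 'DL'
    soup_obj = BeautifulSoup(s.post(url, data=payload).text, "lxml")
    for option in soup_obj.select("select[name='ddldistrict'] option"):
        print(option.get('value', ''), '->', option.get_text(strip=True))

Once you know the right codes, swap them into the main script above.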

Upvotes: 1
