Reputation: 73
I am trying to scrape data from this website. The drop-down menus populate based on the data entered, so I am making multiple POST requests like this:
import requests
from bs4 import BeautifulSoup

url = 'http://59.180.234.21:85/index.aspx'

with requests.Session() as session:
    # Load the form once to get the initial hidden ASP.NET state fields.
    response = session.get(url)
    soup = BeautifulSoup(response.content, "html5lib")

    # Step 1: select the district so the police-station drop-down populates.
    data = {
        'ddlDistrict': '165',
        '__VIEWSTATE': soup.find('input', {'name': '__VIEWSTATE'}).get('value', ''),
        '__EVENTVALIDATION': soup.find('input', {'name': '__EVENTVALIDATION'}).get('value', ''),
    }
    response = session.post(url, data=data)
    soup = BeautifulSoup(response.content, "html5lib")

    # Step 2: select the police station, using the state from the previous response.
    data = {
        'ddlDistrict': '165',
        'ddlPS': '11',
        '__VIEWSTATE': soup.find('input', {'name': '__VIEWSTATE'}).get('value', ''),
        '__EVENTVALIDATION': soup.find('input', {'name': '__EVENTVALIDATION'}).get('value', ''),
    }
    response = session.post(url, data=data)
    soup = BeautifulSoup(response.content, "html5lib")

    # Step 3: submit the FIR number and year.
    data = {
        'ddlDistrict': '165',
        'ddlPS': '11',
        'txtRegNo': '100',
        'ddlYear': '2011',
        '__VIEWSTATE': soup.find('input', {'name': '__VIEWSTATE'}).get('value', ''),
        '__EVENTVALIDATION': soup.find('input', {'name': '__EVENTVALIDATION'}).get('value', ''),
    }
    response = session.post(url, data=data)
After these requests, the last page contains an HTML table with a button that I can click to view the report. I want to simulate clicking that button and get the response, which I can then parse with BeautifulSoup. Please let me know how to do this. Sample input: District: "New Delhi Distt", Police Station: "Con.Place", FirNo: "100", Year: "2011" will give you one FIR to view. The button has the following code:
onclick="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("DgRegist$ctl03$imgDelete", "", true, "", "", false, false))"
Upvotes: 2
Views: 4281
Reputation: 1539
If you can reproduce the HTTP request the button makes, then you'll have the data you want. If the button does not make any request, then the data is already somewhere in the page and you just need to find it and parse it out.
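One way to reproduce such a request without copying values by hand is to read every hidden input from the page you just received and then add your own form values on top. The sketch below is only an illustration: `hidden_fields` and `HiddenFieldParser` are hypothetical helper names, the sample HTML is made up, and it uses the standard library's `html.parser` so it runs without extra dependencies (the same lookup can be done with BeautifulSoup).

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of every <input type="hidden"> on a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <input ... />.
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return a dict of the hidden form fields found in an HTML string."""
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.fields


# Made-up sample page standing in for response.text from the real site.
sample = '''
<form>
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="abc123" />
<input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
<input type="text" name="txtRegNo" value="100" />
</form>
'''

payload = hidden_fields(sample)
# Add the visible form values you want to submit.
payload.update({"ddlDistrict": "165", "ddlPS": "11"})
print(payload)
```

Because the hidden values change with every response, it is probably safer to rebuild the payload from the most recent page before each POST.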
EDIT:
In your case the button posts the form data back to the same page. So you would include the form data in a POST request to that page, and the response will contain the resulting data. For example:
import requests
headers = {
    'Origin': 'http://59.180.234.21:85',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'en-US,en;q=0.8',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'Cache-Control': 'max-age=0',
    'Referer': 'http://59.180.234.21:85/index.aspx',
    'Connection': 'keep-alive',
}
data = [
('__EVENTTARGET', ''),
('__EVENTARGUMENT', ''),
('__LASTFOCUS', ''),
('__VIEWSTATE', '/wEPDwUJMTQ2MDgwNjA1D2QWAgIDD2QWAgIFD2QWBAIBD2QWCGYPZBYEAgEPZBYCAgEPEGRkFgFmZAIDD2QWAgIBDxBkZBYAZAIBD2QWBAIBD2QWAgIBDxAPFgYeDURhdGFUZXh0RmllbGQFCENpdHlOYW1lHg5EYXRhVmFsdWVGaWVsZAUIQ2l0eUNvZGUeC18hRGF0YUJvdW5kZ2QQFREMLS0tU0VMRUNULS0tDUNFTlRSQUwgRElTVFQSQ1JJTUUgQU5EIFJBSUxXQVlTEEVBU1QgREVMSEkgRElTVFQJSUdJIERJU1RUD05FVyBERUxISSBESVNUVAtOT1JUSCBESVNUVBBOT1JUSCBFQVNUIERJU1RUEE5PUlRIIFdFU1QgRElTVFQLT1VURVIgRElTVFQLU09VVEggRElTVFQQU09VVEggRUFTVCBESVNUVBBTT1VUSCBXRVNUIERJU1RUElNQRUNJQUwgQ0VMTCBESVNUVA5TUFVXICYgQyBESVNUVAlWSUdJTEFOQ0UKV0VTVCBESVNUVBURDC0tLVNFTEVDVC0tLQMxNjIDMTY0AzE2OAMxNjkDMTY1AzE2NgMxNzMDMTcyAzE3NAMxNjcDOTU1AzE3MQM5NTQDOTUzAzE2MQMxNzAUKwMRZ2dnZ2dnZ2dnZ2dnZ2dnZ2cWAQIFZAIDD2QWAgIBDxAPFgYfAAUHUFNfTmFtZR8BBQdQU19Db2RlHwJnZBAVCQwtLS1TRUxFQ1QtLS0PQkFSQUtIQU1CQSBST0FEDUNIQU5BS1lBIFBVUkkKQ09OLiBQTEFDRQpFWEguIEdST1VOC01BTkRJUiBNQVJHClBULiBTVFJFRVQKVElMQUsgTUFSRwtUVUdMQUsgUk9BRBUJDC0tLVNFTEVDVC0tLQIwMgIwNwIxMQIxMgIxNQIyMgIzNQIzNhQrAwlnZ2dnZ2dnZ2cWAQIDZAICD2QWBAIBD2QWAgIBDw8WAh4JTWF4TGVuZ3RoAgRkZAIDD2QWAgIBDxBkDxYHAgECAgIDAgQCBQIGAgcWBxAFBDIwMTcFBDIwMTdnEAUEMjAxNgUEMjAxNmcQBQQyMDE1BQQyMDE1ZxAFBDIwMTQFBDIwMTRnEAUEMjAxMwUEMjAxM2cQBQQyMDEyBQQyMDEyZxAFBDIwMTEFBDIwMTFnZGQCAw9kFgRmD2QWAgIBDxBkZBYBAgNkAgIPZBYCAgEPDxYCHgdFbmFibGVkaGRkAgMPZBYCAgEPZBYCZg9kFgICAQ9kFgQCAQ88KwALAGQCAw88KwARAgEQFgAWABYADBQrAABkGAEFCmdyZG12dGhlZnQPZ2SPDrK3c7Ukzq5Wg/XtZSQMgDzEoWpRz8kXOVH1TO1LcA=='),
('__EVENTVALIDATION', '/wEdAC8iT6D3HjIr+ivdq0yBTgClCsHRaAEHIr772zKgggdQ+5cM7ByNsRG4qWi12q7B1tveFDGmjlPiBn9IJO8m9jt8W1Wcqc3FqlgV9EENz1OdJenvj2TG96ujSrFeprbtr3RTWKEdLSZa5NFLztoz81urAMmLvBzV7Qyb4qeGafdxuGr4cVZnct4CZh3KKsvt+xdAs0fg094ls2+uRMaFDPjjvXQmtkg7agsuhug+xMVSXXqKkbM01pitokD3Lzhr/+Zrc1JkJBoj+hAGr8ppVSNG4Yj6XkYB+ZGeix5+udiv9J9IjbG0sujSnR9YEqeLFuIKGVNDezkrxdfUawGK33AxvjAuIFmExdxunofmSVMj2KhPcg/6G9KkHuC16bwbWAqSNP2Vcw4/0wky0Un3Ssd3cGZtjtv+8Amihean2n5uODEqvswSsIcl9+U0P3atZA9gLfz10VlY0S1jS6520f4SrEv7IkN+08PXTozm9OT6/xtTbG8qE+XuugkwabaWLRSnp8pclR+ltj186j/FXuFQADgLnY9pn1HgIJ6W1oaeYRGUECgQhKzewPcXKgm68keQY5UuqQXqAyLatchak9gZ0UXh+krR/3fyyNtTnsY2m8PCEGuPl86vYAMVmqqL9lXoXDEtci8mednFEKQQYva+qH6WXxs8JPfC5HROATEan29Lv0JBrmCBZS2sro8ULkaKOxbg8uzVwdeGr6v29r+3doU6WdnwFP0DXPL1dqxkGAcZoyyvxsCvu30nzr6m7V8lgJSWBob7Dm8GjVgW5r9J4pnX0P+2bLZvBfOH/t4fWMmWiUd3VkQPcKR+pddTuBtpJk290kZ4wQ4JdvCFsSKdBaNizvIH0xP0v3ruMbsMtxjvy3Vie7D95PeNV8/hUPt4D+GqPsOH44Eo2T+LfQkxwBWNveA+4s3aFDJlbkXzUPNrXlzDLLAaZVBaziFS2sS3u5FK3YA3jSyXSEoDlVEvjtTdVzRZn7DFyWrI8V/OY49Qu8R8qTviVpgIZnzlz1HnUusdQsXU9clbfRlGQn3F'),
('ddlDistrict', '165'),
('ddlPS', '11'),
('txtRegNo', '100'),
('ddlYear', '2011'),
('txRegFromDt', ''),
('txRegToDt', ''),
('txtCompNM', ''),
('btnSearch', 'Search'),
]
response = requests.post('http://59.180.234.21:85/index.aspx', headers=headers, data=data)
print(response.content)
>>> b'\r\n\r\n\r\n<!DOCTYPE html PUBLIC ...... FIR No.</a></td><td style="width:10%;"><a href="javascript:__doPostBack('DgRegist$ctl02$ctl03','')" style="color:Black;">Fir Year</a></td><td style="width:10%;">FIR Date</td><td style="width:15%;">\r\n View FIR\r\n </td>\r\n\t\t\t</tr><tr class="DataItemStyle ">\r\n\t\t\t\t<td>0100</td><td>2011</td><td>29-05-2011</td><td>\r\n <input type="image" name="DgRegist$ctl03$imgDelete" id="DgRegist_ctl03_imgDelete" src="Images/print.gif" ... ... \r\n</form>\r\n</body>\r\n</html>\r\n'
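The response above contains the button as an `<input type="image">`. When a browser submits a form through an image button, it sends the click coordinates as `name.x` and `name.y`, so a click on it can probably be simulated by posting those two keys along with the current form fields. This is a sketch under assumptions: `image_button_click` is a hypothetical helper, the `__VIEWSTATE` and `__EVENTVALIDATION` values are placeholders that would have to come from the search-results response, and I have not verified which other fields this particular server requires.

```python
from urllib.parse import urlencode


def image_button_click(form_fields, button_name, x=1, y=1):
    """Return a copy of form_fields plus the click coordinates a browser
    sends for <input type="image" name=button_name>."""
    payload = dict(form_fields)
    payload[button_name + ".x"] = str(x)
    payload[button_name + ".y"] = str(y)
    return payload


# Placeholder state values: read the real ones from the search-results
# page, not from an earlier page.
fields = {
    "__VIEWSTATE": "<value from results page>",
    "__EVENTVALIDATION": "<value from results page>",
    "ddlDistrict": "165",
    "ddlPS": "11",
    "txtRegNo": "100",
    "ddlYear": "2011",
}

click = image_button_click(fields, "DgRegist$ctl03$imgDelete")
print(urlencode(click))

# Then, inside the same session (not run here):
# response = session.post(url, data=click)
```

`btnSearch` is left out on purpose, on the assumption that only the button actually clicked should appear in the submitted form data.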
Upvotes: 4