Trunks
Trunks

Reputation: 17

how to scrape through drop down menu using requests

I want to scrape through web pages that are selected from drop down as shown below

 <h1>Scraping Test</h1>
    <form action="/tests/scraping" method='post'>
        <input type="hidden" name="csrf_token" value="1499585369##d2d1570f820aec0589b3bd5f4ab4e7df913e25ff"/>
        <table>
            <tr>
                <td>Select Ward: </td>
                <td>
                    <select name="ward">
                        <option value=''>-- select --</option>
                        <option value='DHANLAXMICOMPLEX'>DHANLAXMICOMPLEX</option>
                        <option value='POTALIYA'>POTALIYA</option>
                        <option value='ARJUNTOWER'>ARJUNTOWER</option>
                        <option value='NEWCLOTHMARKET'>NEWCLOTHMARKET</option>
                        <option value='CHANAKYAPURI'>CHANAKYAPURI</option>
                        <option value='BHAIKAKANAGAR'>BHAIKAKANAGAR</option>
                        <option value='RADHASWAMYROAD'>RADHASWAMYROAD</option>
                        <option value='SATADHAR'>SATADHAR</option>
                        <option value='AMRUTAVIDYALAYA'>AMRUTAVIDYALAYA</option>
                        <option value='AGARWALTOWERS'>AGARWALTOWERS</option>
                        <option value='RANNAPARK'>RANNAPARK</option>
                        <option value='IIM'>IIM</option>
                        <option value='VEJALPURWARD'>VEJALPURWARD</option>
                        <option value='GITAMANDIR'>GITAMANDIR</option>
                    </select>
                </td>
                <td><input type="submit" value="Search" class="search"/></td>
            </tr>
        </table>

how to request a webpage from that drop down menu and there is also a search button my code

import requests, csv
from lxml import html

def get_all_pages():

   payload = {'value':'DHANLAXMICOMPLEX'}
   url = requests.get('https://recruitment.advarisk.com/tests/scraping',data=payload)
   print(url.text)

Upvotes: 1

Views: 5714

Answers (1)

Andersson
Andersson

Reputation: 52695

There is a token value which you can get from this HTML element

<input type="hidden" name="csrf_token" value="1499585369##d2d1570f820aec0589b3bd5f4ab4e7df913e25ff"/>

and use in your request. Try to use below code and let me know in case of any issues

import lxml.html
import requests

url = "https://recruitment.advarisk.com/tests/scraping"
s = requests.session()
r = s.get(url)
source = lxml.html.document_fromstring(r.content)
token = source.xpath('//input[@name="csrf_token"]/@value')[0]

headers = {'Referer': 'https://recruitment.advarisk.com/tests/scraping'}
data = {'csrf_token': token, 'ward': 'DHANLAXMICOMPLEX'}

print(s.post(url, data=data, headers=headers).text)

Upvotes: 1

Related Questions