mcansado

Reputation: 2164

Getting value from hidden form with BeautifulSoup

I'm trying to scrape a website that has the following in its HTML:

<form id="__AjaxAntiForgeryForm" action="#" method="post">
   <input name="__RequestVerificationToken" type="hidden" value="LOUesP09TLS3suKJk4dF5hIxeo-LmDWLxX8xqwIHYnj-JqR29qDcGA_mtHXvyZIej83qG3FfBBs2nuzk1EY6onTuszY1">
</form>

and I'm trying to extract the value with BeautifulSoup using

page = urllib2.urlopen(LOGIN_URL)
# parse the html using beautiful soup and store in variable `soup`
soup = BeautifulSoup(page, "html.parser")

form = soup.find("form", {"id": "__AjaxAntiForgeryForm"})

which correctly returns

<form action="#" id="__AjaxAntiForgeryForm" method="post"><input name="__RequestVerificationToken" type="hidden" value="zd7XHXyVs7EgqObLzIfm9k4bw1cWfcddhfDZ9Mp8TibBaAJUz-yAp1ZBuKS1iJtEAvmI1WG_EYnbmXBnWzuKWJxfl8U1"/></form>

My problem is in extracting just the value from that tag.

I've tried this answer, and I've also tried

form = soup.find("form", {"id": "__AjaxAntiForgeryForm"})['value']

based on this answer, but it just raises KeyError: 'value'.

I could convert it to a string and extract the value with a regex, but that seems clunky; there must be a cleaner way to do it with BeautifulSoup.

Any ideas?

Upvotes: 1

Views: 3377

Answers (3)

Harshil Prajapati

Reputation: 1

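from urllib.request import urlopen  # Python 3; on Python 2 use urllib2.urlopen

from bs4 import BeautifulSoup

url1 = "LOGINURL"  # placeholder: replace with the real login page URL

# Download the page first; BeautifulSoup parses markup, not URLs
page1 = urlopen(url1)
soup1 = BeautifulSoup(page1, "html.parser")

# The token lives on the <input> tag, so search for that tag directly
form1 = soup1.find('input', {'name': '__RequestVerificationToken'})

print(form1.get('value'))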

Upvotes: 0

ergesto

Reputation: 397

from bs4 import BeautifulSoup

html = '''<form id="__AjaxAntiForgeryForm" action="#" method="post">
           <input name="__RequestVerificationToken" type="hidden" value="LOUesP09TLS3suKJk4dF5hIxeo-LmDWLxX8xqwIHYnj-JqR29qDcGA_mtHXvyZIej83qG3FfBBs2nuzk1EY6onTuszY1">
        </form>'''


soup = BeautifulSoup(html, "html.parser")
value = soup.find('input', {'name':'__RequestVerificationToken'})['value']
print(value)
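
If the field might be missing from the page, subscripting with ['value'] on the result of find() fails, because find() returns None when nothing matches. A minimal sketch of a more defensive lookup follows. The helper name extract_token and the placeholder token abc123 are made up for illustration; select_one() and Tag.get() are standard BeautifulSoup calls.

```python
from bs4 import BeautifulSoup


def extract_token(html, field="__RequestVerificationToken"):
    """Return the named input's value, or None if the input is absent.

    extract_token is a hypothetical helper, not part of BeautifulSoup.
    """
    soup = BeautifulSoup(html, "html.parser")
    # CSS attribute selector: the first <input> whose name matches `field`
    tag = soup.select_one('input[name="%s"]' % field)
    if tag is None:
        return None
    # .get() returns None instead of raising KeyError if "value" is missing
    return tag.get("value")


# "abc123" is a placeholder token, not a real one
sample = '<form><input name="__RequestVerificationToken" type="hidden" value="abc123"></form>'
print(extract_token(sample))
print(extract_token("<form></form>"))
```

The first call prints the placeholder token and the second prints None, so the caller can check for a missing token instead of catching an exception.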

Upvotes: 1

Rakesh

Reputation: 82765

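Use .attrs['value'] on the nested input tag rather than on the form. The value attribute belongs to the <input>, not the <form>, which is why indexing the form raises KeyError.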

Ex:

from bs4 import BeautifulSoup
s = """<form id="__AjaxAntiForgeryForm" action="#" method="post">
   <input name="__RequestVerificationToken" type="hidden" value="LOUesP09TLS3suKJk4dF5hIxeo-LmDWLxX8xqwIHYnj-JqR29qDcGA_mtHXvyZIej83qG3FfBBs2nuzk1EY6onTuszY1">
</form>"""
soup = BeautifulSoup(s, "html.parser")
form = soup.find("form", {"id": "__AjaxAntiForgeryForm"})
print( form.input.attrs['value'] )

Output:

LOUesP09TLS3suKJk4dF5hIxeo-LmDWLxX8xqwIHYnj-JqR29qDcGA_mtHXvyZIej83qG3FfBBs2nuzk1EY6onTuszY1
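
To see why the form lookup in the question raised KeyError, here is a minimal sketch. It assumes a shortened copy of the form with a made-up placeholder token (abc123). Subscripting a tag only consults that tag's own attributes, and the form carries just id, action and method.

```python
from bs4 import BeautifulSoup

# Shortened copy of the form; "abc123" is a placeholder token, not a real one
html = """<form id="__AjaxAntiForgeryForm" action="#" method="post">
   <input name="__RequestVerificationToken" type="hidden" value="abc123">
</form>"""

soup = BeautifulSoup(html, "html.parser")
form = soup.find("form", {"id": "__AjaxAntiForgeryForm"})

# Attribute names present on the <form> tag itself
form_attrs = sorted(form.attrs)

# form["value"] looks for a "value" attribute on the form, which does not exist
try:
    form["value"]
    raised = False
except KeyError:
    raised = True

# The attribute is on the nested <input>, so look it up there
token = form.find("input", {"name": "__RequestVerificationToken"})["value"]

print(form_attrs, raised, token)
```

Looking the input up by name, as in the last step, also keeps working if the form later gains other input fields ahead of the token, whereas form.input always returns the first one.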

Upvotes: 1
