cbos93

Reputation: 35

Printing specific HTML values with Python

I am having trouble printing only one specific value from the HTML I scrape.

This is the specific line of HTML my program looks for:

<input name="form_key" type="hidden" value="MmghsMIlPm5bd2Dw"/>

My code is as follows

import requests, time
from bs4 import BeautifulSoup
from colorama import Fore, Back, Style, init


print(Fore.CYAN + "Lets begin!")
init(autoreset=True)

url = raw_input("Enter URL: ")

print(Fore.CYAN + "\nGetting form key")


r = requests.get(url)

soup = BeautifulSoup(r.content, "html.parser")

data = soup.find_all("input", {'name': 'form_key', 'type':'hidden'})

for data in data:
    print(Fore.YELLOW + "Found Form Key:")
    print(data)

The program scrapes it fine, but it prints the entire tag, whereas I want to print only the contents of the value attribute: MmghsMIlPm5bd2Dw

How can I achieve this?

I have tried things like

print soup.find(data).text

And

last_input_tag = soup.find("input", id="value")
print(last_input_tag)

But nothing has really worked.

Upvotes: 0

Views: 3475

Answers (2)

E. Ducateme

Reputation: 4248

More generically... presuming that there are multiple tags in the html:

from bs4 import BeautifulSoup

html = '''<form><input name="form_key" type="hidden" value="MmghsMIlPm5bd2Dw"/>
<input name="form_key" type="hidden" value="abcdefghijklmo"/>
<input name="form_key" type="hidden"/>
</form>'''

soup = BeautifulSoup(html, "html.parser")

We can search for all tags with the name input.

tags = soup.find_all('input')

We can then cycle through all the tags to retrieve their value attributes. Because a tag's attributes can be accessed much like dictionary entries, we can look up an attribute as though it were a key, using the tag's .get() method. Called with 'value', this method looks for an attribute named value:

  • If it finds the attribute, the method returns the value associated with it.
  • If it cannot find the attribute, the method returns the default you provide as a second argument (or None if you provide no default).

To cycle through the tags...

for tag in tags:
    print(tag.get('value', 'value attribute not found'))

=== Output: ===
MmghsMIlPm5bd2Dw
abcdefghijklmo
value attribute not found
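
Applied to the single form_key input from the question, the same idea might look like the sketch below. The helper name extract_form_key is hypothetical (it is not part of BeautifulSoup), and the HTML is inlined rather than fetched with requests so that the snippet runs on its own.

```python
from bs4 import BeautifulSoup

# The tag from the question, inlined so no network request is needed.
html = '<form><input name="form_key" type="hidden" value="MmghsMIlPm5bd2Dw"/></form>'


def extract_form_key(markup):
    """Return the value of the hidden form_key input, or None if it is absent.

    Hypothetical helper written for this example only.
    """
    soup = BeautifulSoup(markup, "html.parser")
    # find() returns the first matching tag, or None when nothing matches.
    tag = soup.find("input", {"name": "form_key", "type": "hidden"})
    if tag is None:
        return None
    # .get() returns None instead of raising if the attribute is missing.
    return tag.get("value")


form_key = extract_form_key(html)
print(form_key)
```

In the question's script, r.content would take the place of the inlined html string.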

Upvotes: 0

veritaS

Reputation: 519

If printing data shows you the whole input tag, you should be able to print just the value by asking for that attribute explicitly:

print(data.get('value'))

Please refer to the documentation here: https://www.crummy.com/software/BeautifulSoup/bs4/doc/
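
As a rough illustration of how attribute access behaves, the sketch below compares .get('value') with dictionary-style indexing on a tag that has the attribute and on one that does not. The second input tag and the variable names are made up for the example.

```python
from bs4 import BeautifulSoup

# Two inputs: the one from the question, plus an invented one without a value.
soup = BeautifulSoup(
    '<input name="form_key" type="hidden" value="MmghsMIlPm5bd2Dw"/>'
    '<input name="other" type="hidden"/>',
    "html.parser",
)
with_value, without_value = soup.find_all("input")

# Both styles return the attribute when it exists.
value_by_get = with_value.get("value")
value_by_index = with_value["value"]

# .get() quietly returns None for a missing attribute...
missing_by_get = without_value.get("value")

# ...whereas indexing raises KeyError, as it would on a dict.
try:
    without_value["value"]
    index_raised = False
except KeyError:
    index_raised = True

print(value_by_get, value_by_index, missing_by_get, index_raised)
```

Using .get() is the safer choice when some of the scraped tags may lack the attribute.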

Upvotes: 3
