Reputation: 59
I am using BeautifulSoup to search for the class 'student-login' at the URL https://www.champlain.edu/current-students. I then want to search further within that class and return the complete line(s) that contain either the string 'username' or 'password'. My working code returns everything within the class, but I am having no luck enhancing it to return only the lines containing 'username' or 'password'. I have included a screen capture of my current output. Any guidance would be much appreciated. Thanks!!
from urllib.request import urlopen
from bs4 import BeautifulSoup
html = urlopen('https://www.champlain.edu/current-students')
bs = BeautifulSoup(html.read(), 'html.parser')
soup = bs.find(class_='student-login')
print(soup)
Upvotes: 1
Views: 31
Reputation: 87124
For the sake of variety, you can use find(). Because the tags that you are looking for have unique id attributes, you can find them directly:
from urllib.request import urlopen
from bs4 import BeautifulSoup
html = urlopen('https://www.champlain.edu/current-students')
bs = BeautifulSoup(html.read(), 'html.parser')
username = bs.find(id="login-username")
password = bs.find(id="login-password")
>>> print(username)
<input id="login-username" name="username" placeholder="Username" type="text"/>
>>> print(password)
<input id="login-password" name="password" placeholder="Password" type="password"/>
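If the lookup should stay scoped to the student-login block, as the question asks, the same find methods can be called on the container instead of on the whole document. This is only a minimal sketch: SAMPLE_HTML is a hypothetical stand-in for the live page, modeled on the output above, so the example runs without a network request, and the real markup may differ.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical stand-in for the live page; the real markup may differ.
SAMPLE_HTML = """
<div class="student-login">
  <form>
    <input id="login-username" name="username" placeholder="Username" type="text"/>
    <input id="login-password" name="password" placeholder="Password" type="password"/>
    <input name="submit" type="submit" value="Log in"/>
  </form>
</div>
"""

bs = BeautifulSoup(SAMPLE_HTML, "html.parser")
container = bs.find(class_="student-login")

# Search only inside the container for tags whose name attribute
# is exactly "username" or "password". The attrs dict is needed because
# the name keyword of find_all() refers to the tag name, not the attribute.
fields = container.find_all(attrs={"name": re.compile(r"^(username|password)$")})
for field in fields:
    print(field)
```

Against the live page, find() returns None when the class is missing, so it is worth checking container before calling find_all() on it.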
Upvotes: 1
Reputation: 5562
This should give you the input fields:
from urllib.request import urlopen
from bs4 import BeautifulSoup
html = urlopen('https://www.champlain.edu/current-students')
bs = BeautifulSoup(html.read(), 'html.parser')
print(bs.select_one('#login-username'))
print(bs.select_one('#login-password'))
This uses a CSS selector: the # prefix selects the element whose id is login-username (or login-password), which I assume is what you want.
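A CSS selector can also restrict the match to inputs inside the student-login block, which is closer to what the question describes. The sketch below assumes markup similar to the output shown in this thread; SAMPLE_HTML and the extra search-username input are hypothetical, added only to show that elements outside the block are ignored.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the live page; the real markup may differ.
SAMPLE_HTML = """
<input id="search-username" name="username" type="text"/>
<div class="student-login">
  <input id="login-username" name="username" placeholder="Username" type="text"/>
  <input id="login-password" name="password" placeholder="Password" type="password"/>
</div>
"""

bs = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Descendant selectors limit the match to inputs inside .student-login;
# the comma combines the two attribute selectors into one query.
selector = (
    '.student-login input[name="username"], '
    '.student-login input[name="password"]'
)
matches = bs.select(selector)
for tag in matches:
    print(tag)
```

select() returns a list, which is simply empty when nothing matches, so no None check is needed before looping over it.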
Upvotes: 1