Reputation: 93
I am trying to retrieve the label text next to the checkboxes on a form, but only for the checkboxes that are checked.
Here is the HTML:
<div class="x-panel-bwrap" id="ext-gen1956"><div
class="x-panel-body" id="ext-gen1957" style="width: 226px;">
<div class="x-form-check-wrap" id="ext-gen1959"><input type="checkbox" autocomplete="off" id="ext-comp-1609" name="ext-comp-1609" class=" x-form-checkbox x-form-field">
<label for="ext-comp-1609" class="x-form-cb-label" id="ext-gen1960">labeltext1</label></div>
<div class="x-form-check-wrap" id="ext-gen1961"><input type="checkbox" autocomplete="off" id="ext-comp-1607" name="ext-comp-1607" class=" x-form-checkbox x-form-field">
<label for="ext-comp-1607" class="x-form-cb-label" id="ext-gen1962">labeltext2</label></div>
<div class="x-form-check-wrap" id="ext-gen1963"><input type="checkbox" autocomplete="off" id="ext-comp-1605" name="ext-comp-1605" class=" x-form-checkbox x-form-field" checked="">
<label for="ext-comp-1605" class="x-form-cb-label" id="ext-gen1964">labeltext3</label></div>
The label I want is the one beside a checked box, which is distinguished by the attribute checked="". This is what I have so far:
for checkboxes in soup.find_all('input', attrs={"id":"ext-comp-1609"}):
    if checkboxes.find('input', attrs={"checked":""}):
        label_1 = soup.find('label',{'id':'ext-gen1960'}).text
        print(label_1)
    else:
        continue
for checkboxes in soup.find_all('input', attrs={"id":"ext-comp-1607"}):
    if checkboxes.find('input', attrs={"checked":""}):
        label_2 = soup.find('label',{'id':'ext-gen1962'}).text
        print(label_2)
    else:
        continue
for checkboxes in soup.find_all('input', attrs={"id":"ext-comp-1605"}):
    if checkboxes.find('input', attrs={"checked":""}):
        label_3 = soup.find('label',{'id':'ext-gen1964'}).text
        print(label_3)
    else:
        continue
My problem is that this grabs the labels whether they are checked or not. I have tried using has_attr() as well but it yields the same results.
Tried solutions:
soup = BeautifulSoup(browser.page_source, 'html.parser')
for checkbox in soup.find_all('input', checked=True):
    print(checkbox.label.get_text())
and
soup = BeautifulSoup(browser.page_source, 'html.parser')
for checkbox in soup.select('input[checked]'):
    print(checkbox.label.get_text())
and
for checkbox in soup.find_all('input', checked=True):
    print(checkbox.find_next_sibling("label").get_text())
Upvotes: 3
Views: 3610
Reputation: 473763
You should apply the checked=True check to all the input elements. Then, get the inner label element and its text:
soup = BeautifulSoup(data, "html.parser")
for checkbox in soup.find_all('input', checked=True):
    print(checkbox.label.get_text())
Note that for html5lib or lxml, you would need a different way to get to the labels:
soup = BeautifulSoup(data, "html5lib")
for checkbox in soup.find_all('input', checked=True):
    print(checkbox.find_next_sibling("label").get_text())
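Since where the label ends up in the tree depends on the parser, one alternative is to pair each checked input with its label through the id / for attributes that the markup in the question already has. This is only a minimal sketch: checked_labels is a hypothetical helper name, and data is a trimmed copy of the question's HTML.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question.
data = """
<div class="x-form-check-wrap"><input type="checkbox" id="ext-comp-1609" name="ext-comp-1609">
<label for="ext-comp-1609" class="x-form-cb-label">labeltext1</label></div>
<div class="x-form-check-wrap"><input type="checkbox" id="ext-comp-1605" name="ext-comp-1605" checked="">
<label for="ext-comp-1605" class="x-form-cb-label">labeltext3</label></div>
"""


def checked_labels(html, parser="html.parser"):
    """Return the label text of every checked checkbox (hypothetical helper)."""
    soup = BeautifulSoup(html, parser)
    texts = []
    for checkbox in soup.find_all("input", checked=True):
        if not checkbox.has_attr("id"):
            continue  # nothing to match a label against
        # "for" is a Python keyword, so it has to go through the attrs dict.
        label = soup.find("label", attrs={"for": checkbox["id"]})
        if label is not None:
            texts.append(label.get_text(strip=True))
    return texts


print(checked_labels(data))
```

Because the lookup does not rely on the label being a child or a sibling of the input, the same function should behave the same way whichever parser name is passed in.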
Works for me on your input data:
In [1]: from bs4 import BeautifulSoup
In [2]: data = """your HTML here"""
In [3]: soup = BeautifulSoup(data, "html.parser")
In [4]: for checkbox in soup.find_all('input', checked=True):
   ...:     print(checkbox.label.get_text())
   ...:
Can Submit Expense Reports
Upvotes: 3
Reputation: 1727
BeautifulSoup checks the checked attribute with True or False, not "". So you can change your code like this:
# Filter on checked=True in find_all itself; calling find() on the <input>
# would only search inside that tag.
for checkbox in soup.find_all('input', attrs={"id":"ext-comp-1609", "checked":True}):
    label_1 = soup.find('label',{'id':'ext-gen1960'}).text
    print(label_1)
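As a rough sketch of testing the attribute on each tag directly: has_attr() inspects the tag's own attributes, whereas find() searches the tag's descendants. The html string is a trimmed stand-in for the form in the question, and checked_ids is just an illustrative name.

```python
from bs4 import BeautifulSoup

# Trimmed stand-in for the form in the question.
html = """
<div><input type="checkbox" id="ext-comp-1609">
<label for="ext-comp-1609">labeltext1</label></div>
<div><input type="checkbox" id="ext-comp-1605" checked="">
<label for="ext-comp-1605">labeltext3</label></div>
"""

soup = BeautifulSoup(html, "html.parser")

# has_attr() looks at the tag's own attributes, so checked="" counts as present.
checked_ids = [tag["id"] for tag in soup.find_all("input") if tag.has_attr("checked")]
print(checked_ids)
```

Only the second input carries the checked attribute, so only its id ends up in the list.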
Upvotes: -1