Reputation: 1180
I would like to scrape data that sits behind a popup on this page: https://www.commonsense.org/education/game/garrys-mod
The data I need is in the "Subjects & skills" popup. I know I could use Selenium, but I would rather avoid it if it is not necessary.
This is how I am trying to locate the data:
subjectSkills = gameSoup.find('div',class_='popper popper-popover subjects-skills')
However, it returns None, since the content is behind a popup that is opened by this link:
<a href="#" id="subjects-skills" class="body-color" data-toggle="popover" data-content=".subjects-skills" data-arrow="false" target="_self">Subjects & skills</a>
When the arrow button has been clicked, the value of data-arrow changes to true. Changing this value might be a solution, but I am unsure how to do it, or whether it is possible at all.
Thanks
Upvotes: 3
Views: 372
Reputation: 2211
If you are looking for the contents of the Subjects & skills popup, I used:
res = soup.findAll("div", {"class": "subjects-skills__item"})
and the result was:
<div class="subjects-skills__item">
<h5 class="subjects-skills__label">Subjects</h5>
<ul>
<li>Science</li>
</ul>
</div>,
<div class="subjects-skills__item">
<h5 class="subjects-skills__label">Skills</h5>
<ul>
<li>Creativity</li>
<li>Critical Thinking</li>
</ul>
</div>
I found the class by clicking the popup, highlighting the text, then right-clicking and choosing Inspect.
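One possible reason the lookup in the question returns None, offered as a guess rather than a confirmed diagnosis: BeautifulSoup compares a multi-class string passed to class_ against the whole class attribute value, so classes that are only added by JavaScript when the popup opens would make the match fail. The sketch below uses made-up markup (the `hidden` class is an assumption, not taken from the real page) to show the difference between the exact-string match, a single-class match, and a CSS selector.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real page's static HTML may differ.
html = '<div class="subjects-skills hidden"><p>content</p></div>'
soup = BeautifulSoup(html, "html.parser")

# A multi-class string is compared against the full class attribute value,
# so this finds nothing when the element carries a different set of classes.
exact = soup.find("div", class_="popper popper-popover subjects-skills")

# A single class name matches any element that carries that class.
single = soup.find("div", class_="subjects-skills")

# A CSS selector matches on each listed class, regardless of order.
css = soup.select_one("div.hidden.subjects-skills")

print(exact)
print(single is not None)
print(css is single)
```

Searching for one stable class, as done with subjects-skills__item, avoids depending on classes that may only exist after the popup is opened in a browser.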
from bs4 import BeautifulSoup
import requests

def get_data():
    url = 'https://www.commonsense.org/education/game/garrys-mod'
    # Send a browser-like User-Agent with the request
    r = requests.get(url, headers={"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.103 Safari/537.36"})
    html_text = r.text
    # The 'lxml' parser requires the lxml package to be installed
    soup = BeautifulSoup(html_text, 'lxml')
    # Each item holds a label (Subjects or Skills) and its list entries
    res = soup.find_all("div", {"class": "subjects-skills__item"})
    return res

test1 = get_data()
If you just want the text:
# For just the text
for i in test1:
    print(i.text)
returns
Subjects
Science
Skills
Creativity
Critical Thinking
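If structured data is more useful than printed text, the items can be grouped by their label. This is a minimal sketch, assuming the markup matches the output shown above; `group_subjects_skills` and `SAMPLE_HTML` are hypothetical names introduced here, and the sample string stands in for the downloaded page so the example runs without network access.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the output shown above (stand-in for r.text).
SAMPLE_HTML = """
<div class="subjects-skills__item">
  <h5 class="subjects-skills__label">Subjects</h5>
  <ul><li>Science</li></ul>
</div>
<div class="subjects-skills__item">
  <h5 class="subjects-skills__label">Skills</h5>
  <ul><li>Creativity</li><li>Critical Thinking</li></ul>
</div>
"""

def group_subjects_skills(html):
    """Hypothetical helper: map each label to the list of its <li> texts."""
    soup = BeautifulSoup(html, "html.parser")
    grouped = {}
    for item in soup.find_all("div", class_="subjects-skills__item"):
        label = item.find("h5", class_="subjects-skills__label")
        if label is None:
            # Skip items without a label instead of failing
            continue
        entries = [li.get_text(strip=True) for li in item.find_all("li")]
        grouped[label.get_text(strip=True)] = entries
    return grouped

result = group_subjects_skills(SAMPLE_HTML)
print(result)
```

With the real page, the same function could be called on r.text from the request above, provided the site still serves this markup.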
Upvotes: 3