Reputation: 17
I am using Python with BeautifulSoup 4 (BS4) to scrape https://skinup.gg. I am trying to get the values of the multiplier class, in order, from the page.
I tried taking all the data from the div with the history class, but it just returns [] and I am stumped on how to get the multipliers.
I wonder if it is because the class values of the div tags are constantly changing. Which leads me to my second question: how do they get dynamic values into HTML tags? Is it done via JavaScript?
Excuse my grammar.
Here is my code:
import urllib.request
import requests
from bs4 import BeautifulSoup
import urllib
page = requests.g et("https://skinup.gg/"
soup = BeautifulSoup(page.content, "html.parser")
print(soup.find_all('div', attrs={'class': 'win'}))
Relevant website code:
<div class="history"><div class="win" style="">
<time class="date">23:05</time>
<span class="multiplier">2.19</span>
</div><div class="win" style="">
<time class="date">23:04</time>
<span class="multiplier">2.62</span>
</div><div class="lose" style="">
<time class="date">23:04</time>
<span class="multiplier">1.75</span>
</div><div class="lose" style="">
<time class="date">23:04</time>
<span class="multiplier">1.00</span>
</div><div class="lose" style="">
<time class="date">23:04</time>
<span class="multiplier">1.21</span>
</div><div style="">
<time class="date">23:03</time>
<span class="multiplier">1.82</span>
</div><div class="lose" style="">
<time class="date">23:03</time>
<span class="multiplier">1.00</span>
</div><div class="win" style="">
<time class="date">23:03</time>
<span class="multiplier">2.91</span>
</div><div class="lose" style="">
<time class="date">23:02</time>
<span class="multiplier">1.01</span>
</div><div class="win" style="">
<time class="date">23:02</time>
<span class="multiplier">1184.44</span>
</div><div class="win" style="">
<time class="date">23:01</time>
<span class="multiplier">36.81</span>
</div><div class="lose" style="">
<time class="date">22:59</time>
<span class="multiplier">1.38</span>
</div><div class="win" style="">
<time class="date">22:59</time>
<span class="multiplier">2.42</span>
</div><div class="win" style="">
<time class="date">22:59</time>
<span class="multiplier">8.00</span>
</div><div class="win" style="">
<time class="date">22:58</time>
<span class="multiplier">3.42</span>
</div><div class="win" style="">
<time class="date">22:57</time>
<span class="multiplier">2.04</span>
</div><div class="lose" style="">
<time class="date">22:57</time>
<span class="multiplier">1.17</span>
</div><div class="lose" style="">
<time class="date">22:57</time>
<span class="multiplier">1.24</span>
</div><div class="lose" style="">
<time class="date">22:57</time>
<span class="multiplier">1.11</span>
</div><div class="lose" style="">
<time class="date">22:56</time>
<span class="multiplier">1.53</span>
</div>
</div>
Upvotes: 0
Views: 162
Reputation: 384
As t.m.adam mentioned, urllib and requests can't get the dynamically generated page source: they only download the initial HTML and do not run JavaScript.
However, if you inspect the page you linked with Chrome Developer Tools, you can see that the div with the win class is generated when round.multiplier > 2.
The round data is received through 'socketcluster/', which uses the wss (secure WebSocket) protocol.
So you should use a Python WebSocket client library to achieve your goal.
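A rough sketch of that idea follows, using the websocket-client package. Several things here are assumptions rather than facts about the site: the full endpoint URL, the SocketCluster-style `#handshake` event, and the `#1`/`#2` heartbeat strings used by older SocketCluster protocol versions. The channel names and payload layout are unknown, so the sketch only prints whatever frames arrive.

```python
import json

# Assumption: SocketCluster-style handshake frame; the site may expect something else.
HANDSHAKE = json.dumps({"event": "#handshake", "data": {"authToken": None}, "cid": 1})


def handle_frame(raw):
    """Classify one incoming frame.

    Returns ("pong", "#2") for a "#1" heartbeat (assumed older SocketCluster
    convention), ("message", obj) for a JSON frame, else ("raw", raw).
    """
    if raw == "#1":
        return ("pong", "#2")
    try:
        return ("message", json.loads(raw))
    except (TypeError, ValueError):
        return ("raw", raw)


def listen(url="wss://skinup.gg/socketcluster/", limit=20):
    """Print up to `limit` frames. The URL is a guess based on the path seen in dev tools."""
    import websocket  # pip install websocket-client

    ws = websocket.create_connection(url)
    try:
        ws.send(HANDSHAKE)
        for _ in range(limit):
            kind, payload = handle_frame(ws.recv())
            if kind == "pong":
                ws.send(payload)  # answer the heartbeat so the server keeps the socket open
            else:
                print(kind, payload)
    finally:
        ws.close()


if __name__ == "__main__":
    listen()
```

Once you can see the frames, look for the one that carries the round multiplier and pull the value out of the decoded JSON.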
Upvotes: 1
Reputation: 931
For the website in question, you will need to use Selenium to get the data you want, because the history is filled in by JavaScript after the page loads.
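A minimal sketch of what that could look like with Selenium 4 is below. It assumes Chrome and a matching driver are available locally, and that the CSS selector taken from the markup in the question still matches the live page. The function names `to_floats` and `fetch_multipliers` are made up for this example.

```python
def to_floats(texts):
    """Convert scraped strings such as '2.19' to floats, skipping blank entries."""
    return [float(t) for t in (s.strip() for s in texts) if t]


def fetch_multipliers(url="https://skinup.gg/", timeout=15):
    """Open the page in a real browser and return the multipliers in page order."""
    # Imported here so the helper above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    selector = "div.history span.multiplier"  # based on the markup in the question
    driver = webdriver.Chrome()  # assumption: Chrome and its driver are set up
    try:
        driver.get(url)
        # Wait until JavaScript has inserted at least one multiplier element.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, selector))
        )
        spans = driver.find_elements(By.CSS_SELECTOR, selector)
        return to_floats(span.text for span in spans)
    finally:
        driver.quit()


if __name__ == "__main__":
    print(fetch_multipliers())
```

The explicit wait matters here: reading the page source immediately after `driver.get` can still return the empty history container.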
Upvotes: 1
Reputation: 864
First of all, this line should raise a syntax error:
page = requests.g et("https://skinup.gg/"
Change it to:
page = requests.get("https://skinup.gg/")
I suggest using lxml instead of html.parser; it is generally faster.
Now, to answer your question:
The div elements with the win class are nested inside the div with the history class. So first search for history, then search for win within the result.
However, when I ran your script and cross-checked the page source of the site you linked, there was no div with the win class.
Could you mention where you got the relevant website code from?
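To illustrate the nested search, here is a small sketch that runs against a shortened copy of the markup from the question instead of the live site (the live response may not contain this markup at all). It finds the history div first, then reads every multiplier span inside it in document order. `extract_multipliers` is a name chosen for this example, and html.parser is used only so the snippet has no extra dependency.

```python
from bs4 import BeautifulSoup

# Shortened copy of the markup posted in the question (not fetched live).
SAMPLE_HTML = """
<div class="history"><div class="win" style="">
<time class="date">23:05</time>
<span class="multiplier">2.19</span>
</div><div class="lose" style="">
<time class="date">23:04</time>
<span class="multiplier">1.75</span>
</div><div style="">
<time class="date">23:03</time>
<span class="multiplier">1.82</span>
</div>
</div>
"""


def extract_multipliers(html):
    """Return (time, multiplier, outcome) tuples in document order."""
    soup = BeautifulSoup(html, "html.parser")
    history = soup.find("div", class_="history")
    if history is None:  # e.g. the block is only added later by JavaScript
        return []
    rows = []
    for span in history.find_all("span", class_="multiplier"):
        entry = span.parent  # the enclosing win/lose/unclassed div
        time_tag = entry.find("time", class_="date")
        when = time_tag.get_text(strip=True) if time_tag else None
        outcome = (entry.get("class") or [None])[0]  # None when the div has no class
        rows.append((when, float(span.get_text(strip=True)), outcome))
    return rows


if __name__ == "__main__":
    for row in extract_multipliers(SAMPLE_HTML):
        print(row)
```

Searching for the multiplier spans rather than for win also picks up the lose entries and the entries with no class, which keeps the values in their original order.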
Upvotes: 2