Paul Park

Reputation: 17

Python BeautifulSoup4: find_all returns "[]"

I am using Python and BS4 to scrape https://skinup.gg. I am trying to get the values of the multiplier class, in order, from the page.

I tried to scrape the information by taking all the data from the div with the history class. However, it just returns [], and I am stumped on how to get the multipliers.

I wonder if it is because the class values of the div tags are constantly changing. That leads to my second question: how can HTML tags have dynamic values? Is it done via JavaScript?

Excuse my grammar.

Here is my code:

import urllib.request
import requests
from bs4 import BeautifulSoup
import urllib

page = requests.g et("https://skinup.gg/"
soup = BeautifulSoup(page.content, "html.parser")


print(soup.find_all('div', attrs={'class': 'win'}))

Relevant website code:

<div class="history"><div class="win" style="">
  <time class="date">23:05</time>
  <span class="multiplier">2.19</span>
</div><div class="win" style="">
  <time class="date">23:04</time>
  <span class="multiplier">2.62</span>
</div><div class="lose" style="">
  <time class="date">23:04</time>
  <span class="multiplier">1.75</span>
</div><div class="lose" style="">
  <time class="date">23:04</time>
  <span class="multiplier">1.00</span>
</div><div class="lose" style="">
  <time class="date">23:04</time>
  <span class="multiplier">1.21</span>
</div><div style="">
  <time class="date">23:03</time>
  <span class="multiplier">1.82</span>
</div><div class="lose" style="">
  <time class="date">23:03</time>
  <span class="multiplier">1.00</span>
</div><div class="win" style="">
  <time class="date">23:03</time>
  <span class="multiplier">2.91</span>
</div><div class="lose" style="">
  <time class="date">23:02</time>
  <span class="multiplier">1.01</span>
</div><div class="win" style="">
  <time class="date">23:02</time>
  <span class="multiplier">1184.44</span>
</div><div class="win" style="">
  <time class="date">23:01</time>
  <span class="multiplier">36.81</span>
</div><div class="lose" style="">
  <time class="date">22:59</time>
  <span class="multiplier">1.38</span>
</div><div class="win" style="">
  <time class="date">22:59</time>
  <span class="multiplier">2.42</span>
</div><div class="win" style="">
  <time class="date">22:59</time>
  <span class="multiplier">8.00</span>
</div><div class="win" style="">
  <time class="date">22:58</time>
  <span class="multiplier">3.42</span>
</div><div class="win" style="">
  <time class="date">22:57</time>
  <span class="multiplier">2.04</span>
</div><div class="lose" style="">
  <time class="date">22:57</time>
  <span class="multiplier">1.17</span>
</div><div class="lose" style="">
  <time class="date">22:57</time>
  <span class="multiplier">1.24</span>
</div><div class="lose" style="">
  <time class="date">22:57</time>
  <span class="multiplier">1.11</span>
</div><div class="lose" style="">
  <time class="date">22:56</time>
  <span class="multiplier">1.53</span>
</div>

                </div>

Upvotes: 0

Views: 162

Answers (3)

Park

Reputation: 384

As t.m.adam mentioned, urllib and requests can't fetch the dynamically generated page source.
If you inspect the page you linked with Chrome developer tools, you can see that the win class is added to a div when round.multiplier > 2.
The round data is received through 'socketcluster/', which uses the wss (secure WebSocket) protocol.
So you should use a Python WebSocket client library to achieve your goal.
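
To illustrate why find_all comes back empty, here is a minimal sketch. It assumes the server sends an empty history container that scripts fill in later; the raw_html string is a made-up stand-in for that response, not the site's actual markup, and count_win_divs is a hypothetical helper name. The rendered_html string follows the shape of the snippet in the question.

```python
from bs4 import BeautifulSoup

# Assumption: roughly what requests might receive before any script runs.
# This string is illustrative only, not copied from the real site.
raw_html = '<div class="history"></div>'

# Shape of the DOM after scripts have run (taken from the question).
rendered_html = """
<div class="history"><div class="win" style="">
  <time class="date">23:05</time>
  <span class="multiplier">2.19</span>
</div><div class="lose" style="">
  <time class="date">23:04</time>
  <span class="multiplier">1.75</span>
</div></div>
"""


def count_win_divs(html):
    """Return how many <div class="win"> elements the parser can see."""
    soup = BeautifulSoup(html, "html.parser")
    return len(soup.find_all("div", class_="win"))


print(count_win_divs(raw_html))       # 0: nothing to find before scripts run
print(count_win_divs(rendered_html))  # 1: the rows exist only after rendering
```

BeautifulSoup only parses the text it is handed, so if the rows are added in the browser afterwards, the parser never sees them.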

Upvotes: 1

NOP da CALL

Reputation: 931

For the website in question, you will need to use Selenium to get the data you want.
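
A rough sketch of what that could look like, assuming Selenium 4 with a local Chrome install. The CSS selector is inferred from the markup in the question, and the names SELECTOR, parse_multiplier, and fetch_multipliers are hypothetical; the timeout is an arbitrary choice.

```python
# Selector inferred from the question's markup; adjust if the page differs.
SELECTOR = "div.history span.multiplier"


def parse_multiplier(text):
    """Convert the text of a multiplier span, e.g. ' 2.19 ', to a float."""
    return float(text.strip())


def fetch_multipliers(url, timeout=15):
    """Open the page in a real browser and read the rendered multipliers."""
    # Imported here so the helper above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Wait until at least one multiplier has been added by the page's scripts.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, SELECTOR))
        )
        spans = driver.find_elements(By.CSS_SELECTOR, SELECTOR)
        # Elements come back in document order, so the page's order is kept.
        return [parse_multiplier(span.text) for span in spans]
    finally:
        driver.quit()
```

Calling fetch_multipliers("https://skinup.gg/") would then return the multipliers as floats, provided the page still uses this markup.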

Upvotes: 1

Harshith Thota

Reputation: 864

First of all, this should throw a syntax error:

page = requests.g et("https://skinup.gg/"

Change it to:

page = requests.get("https://skinup.gg/")

I suggest using lxml instead of html.parser; it is generally faster, though it has to be installed separately.

Now, to answer your question:

The div elements with the win class are nested inside the div with the history class. So first search for history, then search for win within the result.
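
As a sketch of that two-step search, run against a shortened copy of the markup from the question rather than the live site (multipliers_in_order and its row_class parameter are hypothetical names):

```python
from bs4 import BeautifulSoup

# Shortened copy of the "Relevant website code" from the question.
html = """
<div class="history"><div class="win" style="">
  <time class="date">23:05</time>
  <span class="multiplier">2.19</span>
</div><div class="win" style="">
  <time class="date">23:04</time>
  <span class="multiplier">2.62</span>
</div><div class="lose" style="">
  <time class="date">23:04</time>
  <span class="multiplier">1.75</span>
</div></div>
"""


def multipliers_in_order(html, row_class=None):
    """Return multipliers in document order, optionally only for one row class."""
    soup = BeautifulSoup(html, "html.parser")
    # Step 1: find the container with the history class.
    history = soup.find("div", class_="history")
    if history is None:
        return []
    # Step 2: look only at the divs that are direct children of the container.
    rows = history.find_all("div", recursive=False)
    if row_class is not None:
        rows = [row for row in rows if row_class in row.get("class", [])]
    values = []
    for row in rows:
        span = row.find("span", class_="multiplier")
        if span is not None:
            values.append(float(span.get_text(strip=True)))
    return values


print(multipliers_in_order(html))         # [2.19, 2.62, 1.75]
print(multipliers_in_order(html, "win"))  # [2.19, 2.62]
```

This only works once the rendered markup is actually in the string being parsed.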

But when I ran your script and cross-checked the page source of the site you linked, there was no div element with the win class.

Could you mention where you got the "Relevant website code" from?

Upvotes: 2
