jumbohiggins

Reputation: 11

BeautifulSoup web scraping.

I am trying to scrape data from DnDbeyond. I am using BeautifulSoup with Python and have been able to get some of the information I need by calling find_all on "div" tags and their classes, but I can't find the info in these formatted blocks that contain the character's stats.

<div class="ct-skills__col--skill">Animal Handling</div>

I should be able to just search for soup.find("div", {"class": "ct-skills__col--skill"})

right?

This is what my current code looks like.

from bs4 import BeautifulSoup
import requests

resp = requests.get('https://www.dndbeyond.com/characters/4741434')
soup = BeautifulSoup(resp.text, 'lxml')

divTag = soup.find_all("div", {"class": "container"})

Which gets me

[<div class="container">
<div class="main content-container" id="content">
<section class="primary-content" role="main">
<div data-character-endpoint="/character/4741434/json" data-character-id="4741434" data-read-only="true" id="character-sheet-target"></div>
<script src="/Content/1-0-482-0/React/CharacterTools/dist/characterSheet.bundle.min.js" type="text/javascript"></script>
</section>
</div>
</div>]

I know that my info is under "character-sheet-target", but I can't figure out how to get the info or classes under there.

Sorry if this is rambly; I didn't know how to explain this well.

Upvotes: 1

Views: 604

Answers (2)

Martin-Gilles Lavoie

Reputation: 584

I'm nearly done fleshing out the entire structure.

The Objective-C source includes all class definitions.

https://github.com/mouser/BeyondDnD

Upvotes: 0

Sohan Das

Reputation: 1620

You can use their JSON API; there is no need for Selenium. See the code below.

import requests
req = requests.get('https://www.dndbeyond.com/character/4741434/json')
print(req.json())
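
The JSON URL does not have to be hard-coded: it matches the data-character-endpoint attribute on the empty "character-sheet-target" div in your output. That div is most likely filled in later by the JavaScript bundle, which would explain why searching for "ct-skills__col--skill" in the downloaded HTML finds nothing. Here is a minimal sketch of reading that attribute with BeautifulSoup. It assumes the page still contains the markup shown in the question; find_character_endpoint is a hypothetical helper name, and the HTML is a static copy so the example runs without a network call.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Static copy of the markup from the question's output (assumption: the
# live page still looks like this; it may have changed since).
PAGE_URL = "https://www.dndbeyond.com/characters/4741434"
PAGE_HTML = """
<div class="container">
<div class="main content-container" id="content">
<section class="primary-content" role="main">
<div data-character-endpoint="/character/4741434/json" data-character-id="4741434" data-read-only="true" id="character-sheet-target"></div>
</section>
</div>
</div>
"""


def find_character_endpoint(html, page_url):
    """Return the absolute JSON endpoint URL advertised by the page, or None.

    Hypothetical helper for illustration; not part of any library.
    """
    soup = BeautifulSoup(html, "html.parser")
    # The sheet container is empty in the raw HTML, but its attributes
    # tell us where the character data is served from.
    target = soup.find("div", id="character-sheet-target")
    if target is None or not target.has_attr("data-character-endpoint"):
        return None
    # The attribute holds a site-relative path, so resolve it against the page URL.
    return urljoin(page_url, target["data-character-endpoint"])


endpoint_url = find_character_endpoint(PAGE_HTML, PAGE_URL)
print(endpoint_url)
```

On the live site you would pass resp.text from your own requests.get call instead of PAGE_HTML, then fetch the returned URL with requests.get(...).json() as above.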

Upvotes: 1
