python list beautiful soup web scraping question

Question

I am completely new to Python and to programming in general. At the moment I am experimenting with the Beautiful Soup library, and I tried to extract some fund data from a website. In the end I got a list with all the data I am interested in (top holdings, top countries and top sectors). For each of these categories I got a list (or rather a bs4.element.ResultSet) like this:

[ ,

My problem: the output above is only one element of my list. The next element looks similar but holds the data for countries, and a further element holds the data for sectors.

What is the best way to get the asset names (Apple, Microsoft, ...) and the percentages (3.43, 2.77, ...) into a list or a pandas DataFrame so that I can work with them?

The whole code so far is:

from bs4 import BeautifulSoup
import requests
import pandas as pd
asset_isin = "IE00BGHQ0G80"
url = f"https://www.fondsweb.com/de/{asset_isin}"
page = requests.get(url)
soup = BeautifulSoup(page.content, "html.parser")
data = soup.find_all("div", attrs={"class":"fw--chart fwwBreakdown"})
top_holdings = data[0]
top_countries = data[1]
top_sectors = data[2]

So with data[0] I get the output shown above, starting with [div class=..., but everything sits in that single element [0] instead of being split into names and percentages.

Thanks in advance

user5386938 · Accepted Answer

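I am not sure exactly what you need, but the following reads the JSON stored in the data-breakdown attribute of each chart div and collects the names and values into one DataFrame: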

# coding: UTF-8
import pandas as pd
from bs4 import BeautifulSoup
import requests
import json

asset_isin = "IE00BGHQ0G80"
url = f"https://www.fondsweb.com/de/{asset_isin}"
page = requests.get(url)
soup = BeautifulSoup(page.content, "html.parser")
charts = soup.select('div.fw--chart.fwwBreakdown')

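# Collect every series name and its first value across all charts
data = {'name': [], 'data': []}
for d in charts:
    # The chart data is stored as a JSON string in the data-breakdown attribute
    o = json.loads(d['data-breakdown'])
    for s in o['series']:
        data['name'].append(s['name'])
        data['data'].append(s['data'][0])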

df = pd.DataFrame(data)

print(df)
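
The code above puts holdings, countries and sectors into a single DataFrame. If you would rather keep the three categories apart, as in the question, a minimal sketch could look like the one below. It assumes the charts appear in the order holdings, countries, sectors and that each data-breakdown attribute has the same 'series' layout used above. The function name breakdown_frames, the column names and the SAMPLE_HTML string are made up for illustration; the sample only mimics the page so the snippet runs without a network request, and its country and sector entries are placeholders, not real fund data.

```python
import json

import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical stand-in for page.content; only Apple/Microsoft come from the
# question, the other entries are placeholders.
SAMPLE_HTML = """
<div class="fw--chart fwwBreakdown"
     data-breakdown='{"series": [{"name": "Apple", "data": [3.43]}, {"name": "Microsoft", "data": [2.77]}]}'></div>
<div class="fw--chart fwwBreakdown"
     data-breakdown='{"series": [{"name": "Country A", "data": [1.0]}]}'></div>
<div class="fw--chart fwwBreakdown"
     data-breakdown='{"series": [{"name": "Sector A", "data": [2.0]}]}'></div>
"""


def breakdown_frames(html, labels=("holdings", "countries", "sectors")):
    """Return one DataFrame per breakdown chart, keyed by label.

    Assumes the charts occur in the same order as `labels`.
    """
    soup = BeautifulSoup(html, "html.parser")
    charts = soup.select("div.fw--chart.fwwBreakdown")
    frames = {}
    for label, chart in zip(labels, charts):
        # Parse the JSON string stored in the attribute
        series = json.loads(chart["data-breakdown"])["series"]
        # Keep the name and the first value of each series entry
        rows = [(s["name"], s["data"][0]) for s in series]
        frames[label] = pd.DataFrame(rows, columns=["name", "percent"])
    return frames


frames = breakdown_frames(SAMPLE_HTML)
print(frames["holdings"])
```

With the real page you would pass page.content instead of SAMPLE_HTML; frames["countries"] and frames["sectors"] are then available in the same way.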
