Guacaka283

Reputation: 83

Web scraping using BeautifulSoup - link embedded behind the marked up text

I am trying to scrape data from https://eresearch.fidelity.com/eresearch/goto/markets_sectors/landing.jhtml. The goal is to get the latest performance data for the 11 US market sectors. The performance figures are not shown on the landing page, though: each sector name is a link, and the data only appears on the page behind it. I want a list of tuples, one per sector, each containing the sector name, the amount the sector has moved, its market capitalization, its market weight, and a link to the Fidelity page for that sector.

Below is the code I have so far. I am stuck on getting the content of each sector's page: my code returns nothing at all. Please help! Thank you in advance.

    import requests
    from bs4 import BeautifulSoup
    url = "https://eresearch.fidelity.com/eresearch/goto/markets_sectors/landing.jhtml"
    req = requests.get(url)
    soup = BeautifulSoup(req.content, "html.parser")
    
    links_list = list()
    next_page_link = soup.find_all("a", class_="heading1")
    for link in next_page_link:
        next_page = "https://eresearch.fidelity.com"+link.get("href")
        links_list.append(next_page)
    
    for item in links_list:
        soup2 = BeautifulSoup(requests.get(item).content,'html.parser')
        print(soup2)

Upvotes: 1

Views: 175

Answers (1)

Andrej Kesely

Reputation: 195543

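Try taking the sector ID from each link on the landing page, requesting that sector's page directly, and selecting the values from it: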

    import requests
    from bs4 import BeautifulSoup

    url = "https://eresearch.fidelity.com/eresearch/goto/markets_sectors/landing.jhtml"
    # Template for a sector's detail page; sector_id is filled in per sector.
    sector_url = "https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector={sector_id}"

    soup = BeautifulSoup(requests.get(url).content, "html.parser")

    # Header row for the printed table.
    print(
        "{:<30} {:<8} {:<8} {:<8} {}".format(
            "Sector name",
            "Moving",
            "MktCap",
            "MktWght",
            "Link",
        )
    )
    for a in soup.select("a.heading1"):
        # The sector ID is whatever follows the last "=" in the href.
        sector_id = a["href"].split("=")[-1]

        u = sector_url.format(sector_id=sector_id)
        s = BeautifulSoup(requests.get(u).content, "html.parser")

        # In each cell that contains a .timestamp element, take every <span>
        # that is the first <span> among its siblings.
        data = s.select("td:has(.timestamp) span:nth-of-type(1)")

        # Only the first three matches are printed, followed by the link.
        print(
            "{:<30} {:<8} {:<8} {:<8} {}".format(
                s.h1.text, *[d.text for d in data][:3], u
            )
        )

Prints:

    Sector name                    Moving   MktCap   MktWght  Link
    Communication Services         +1.78%   $6.70T   11.31%   https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=50
    Consumer Discretionary         +0.62%   $8.82T   12.32%   https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=25
    Consumer Staples               +0.26%   $4.41T   5.75%    https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=30
    Energy                         +3.30%   $2.83T   2.60%    https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=10
    Financials                     +1.59%   $8.79T   11.22%   https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=40
    Health Care                    +0.07%   $8.08T   13.29%   https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=35
    Industrials                    +1.41%   $5.72T   8.02%    https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=20
    Information Technology         +1.44%   $15.52T  28.04%   https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=45
    Materials                      +1.60%   $2.51T   2.46%    https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=15
    Real Estate                    +1.04%   $1.67T   2.58%    https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=60
    Utilities                      -0.04%   $1.56T   2.42%    https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=55
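
Since the question asks for a list of tuples rather than printed rows, the same loop could collect its results instead of printing them. The following is only a sketch under a few assumptions: SectorRow, sector_id_from_href, build_row and scrape_sectors are hypothetical names introduced here, the page structure is assumed to be the same as in the code above, and reading the ID from a "sector" query parameter is a guess based on the sector URLs shown (it falls back to the text after the last "=" when that parameter is missing).

```python
from typing import List, NamedTuple
from urllib.parse import parse_qs, urlparse

LANDING_URL = "https://eresearch.fidelity.com/eresearch/goto/markets_sectors/landing.jhtml"
SECTOR_URL = (
    "https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/"
    "sectors_in_market.jhtml?tab=learn&sector={sector_id}"
)


class SectorRow(NamedTuple):
    # Hypothetical record type; a NamedTuple is still an ordinary tuple.
    name: str
    move: str
    market_cap: str
    market_weight: str
    link: str


def sector_id_from_href(href: str) -> str:
    """Read the sector ID from the query string, else use the text after the last '='."""
    values = parse_qs(urlparse(href).query).get("sector")
    return values[0] if values else href.split("=")[-1]


def build_row(name: str, values: List[str], link: str) -> SectorRow:
    """Combine the page heading, the first three scraped values, and the link."""
    if len(values) < 3:
        raise ValueError(f"expected at least 3 values for {name!r}, got {len(values)}")
    move, market_cap, market_weight = (v.strip() for v in values[:3])
    return SectorRow(name.strip(), move, market_cap, market_weight, link)


def scrape_sectors() -> List[SectorRow]:
    """Fetch every sector page and return one tuple per sector (needs network access)."""
    # Third-party imports are kept local so the helpers above work without them.
    import requests
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(requests.get(LANDING_URL).content, "html.parser")
    rows = []
    for a in soup.select("a.heading1"):
        link = SECTOR_URL.format(sector_id=sector_id_from_href(a["href"]))
        page = BeautifulSoup(requests.get(link).content, "html.parser")
        # Same selector as in the code above.
        data = page.select("td:has(.timestamp) span:nth-of-type(1)")
        rows.append(build_row(page.h1.text, [d.text for d in data], link))
    return rows
```

Calling scrape_sectors() should then give one tuple per sector, which can be indexed by position or by field name (for example row.market_cap). build_row raises a ValueError if a page yields fewer than three values, so a change in the page layout shows up as an error instead of a silently shortened tuple.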

Upvotes: 1
