Python - BeautifulSoup: Pull stock data from Morningstar

Question

I am trying to extract two data points (last close price and market cap) from the Morningstar website for a list of companies and save them to a text file, but I am not sure how to approach this task. Below is my code:

from bs4 import BeautifulSoup as BS

thislist = ["AAPL","FB","TSLA","DIS"] 
for symbol in thislist:
    print ('Getting data for ' + symbol + '...\n')

# extract from this website
url="https://www.morningstar.com/stocks/xnas/" + symbol + "/quote"
        
soup = BS(url)
        
# Find the Value of Last Close Price
for text in soup.find_all('div class', name_='Last Close'):
    Last_Close = text.find_all('dp-value price-down')
    print(Last_Close)     
        
# Find the Value of its Market Cap
for text in soup.find_all('div class', name_='Market Cap'):
    Market_Cap = text.find_all('dp-value')
    print(Market_Cap)      
        
# Print the table
print(symbol, Last_Close, Market_Cap)
            
# Save the data in a .txt file
df.to_csv(r'c:\data\testing.txt', header=None, index=None, sep=' ', mode='a')

Joe · Accepted Answer

First of all, this code will get you the information you need:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.common.by import By
import time

symbols = ["AAPL", "FB", "TSLA", "DIS"]


def download_data(symbol):
    url = f'https://www.morningstar.com/stocks/xnas/{symbol}/quote'
    s = Service(ChromeDriverManager().install())
    op = webdriver.ChromeOptions()
    op.add_argument("--headless")  # run Chrome without a visible window
    driver = webdriver.Chrome(service=s, options=op)
    driver.get(url)

    # Give the JavaScript-rendered quote panel time to load before reading
    # Last Close and Market Cap.
    time.sleep(2)

    last_close = driver.find_element(by=By.XPATH,
                                         value='//*[@id="__layout"]/div/div[2]/div[3]/main/div[2]/div/div/div[1]/div[1]/div/sal-components/section/div/div/div/sal-components-quote/div/div/div/div/div/div[2]/ul/li[1]/div/div[2]')
    market_cap = driver.find_element(by=By.XPATH,
                                          value='//*[@id="__layout"]/div/div[2]/div[3]/main/div[2]/div/div/div[1]/div[1]/div/sal-components/section/div/div/div/sal-components-quote/div/div/div/div/div/div[2]/ul/li[7]/div/div[2]')
    result = symbol, last_close.text, market_cap.text
    driver.quit()  # close the browser so each call does not leave Chrome running
    return result


for symbol in symbols:
    print(download_data(symbol))

The output looks like this:

('AAPL', '164.51', '2.6529 Tril')
('FB', '316.56', '843.3460 Bil')
('TSLA', '996.27', '947.9256 Bil')

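The page for Disney doesn't exist at this URL, which is why DIS is missing from the output. The path hard-codes the xnas (NASDAQ) exchange segment, and Disney is listed on the NYSE, so you might want to check the URL for each symbol.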
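If the missing Disney page comes down to the exchange segment in the URL, one option is to look the segment up per ticker instead of hard-coding it. This is only a sketch: the `EXCHANGES` mapping and `build_url` helper are hypothetical names, and `xnys` for Disney is an assumption based on its NYSE listing, so it is worth opening the resulting URL in a browser first.

```python
# Exchange segment per ticker. "xnys" for Disney is an assumption based on its
# NYSE listing; verify the resulting URL in a browser before relying on it.
EXCHANGES = {"AAPL": "xnas", "FB": "xnas", "TSLA": "xnas", "DIS": "xnys"}


def build_url(symbol, exchanges=EXCHANGES):
    """Build the quote URL, falling back to xnas for tickers not in the mapping."""
    exchange = exchanges.get(symbol, "xnas")
    return f"https://www.morningstar.com/stocks/{exchange}/{symbol}/quote"


for symbol in EXCHANGES:
    print(build_url(symbol))
```

Inside `download_data`, the `url = ...` line could then call `build_url(symbol)` instead.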

You can collect the results in a data frame and export them to CSV however you like. I would suggest using Selenium instead of Beautiful Soup here. Beautiful Soup only parses the HTML it is given, so values that are rendered dynamically by JavaScript may be missing from the raw page source. Selenium drives a real browser, so it sees the page just as you would when you visit it.
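For the saving step, here is a minimal sketch that uses the standard-library `csv` module in place of a data frame. The `save_rows` helper is a hypothetical name, and the sample rows are simply the three tuples from the output above; in practice they would come from `download_data(symbol)`.

```python
import csv

# Sample rows copied from the output shown above; in practice these would be
# the tuples returned by download_data(symbol).
rows = [
    ("AAPL", "164.51", "2.6529 Tril"),
    ("FB", "316.56", "843.3460 Bil"),
    ("TSLA", "996.27", "947.9256 Bil"),
]


def save_rows(rows, path):
    """Append (symbol, last_close, market_cap) rows to a tab-separated text file."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerows(rows)
```

Calling `save_rows(rows, "testing.txt")` appends one line per symbol. A tab is used as the separator because the market cap values contain a space ("2.6529 Tril"), which would make a space-separated file ambiguous.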

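Also, in your code you call soup = BS(url), which hands Beautiful Soup the URL string itself rather than the page behind it. I haven't used BS in a while, but I believe you need to make an HTTP request first, for example with the requests library, and pass the response text to BS.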
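As a rough sketch of that fetch-then-parse pattern, assuming the `requests` and `beautifulsoup4` packages are installed: `fetch_soup` and `find_by_class` are hypothetical helper names, and the demo markup is hand-written, not Morningstar's actual HTML. It also shows the `class_` keyword, which is how `find_all` filters by CSS class.

```python
import requests
from bs4 import BeautifulSoup


def fetch_soup(url):
    """Download a page and return it parsed as a BeautifulSoup tree."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return BeautifulSoup(response.text, "html.parser")


def find_by_class(soup, tag, class_name):
    """Return the stripped text of every <tag> element with the given CSS class."""
    return [el.get_text(strip=True) for el in soup.find_all(tag, class_=class_name)]


# Offline demo on a hand-written snippet (hypothetical markup, not Morningstar's).
sample = '<div class="dp-value">164.51</div><div class="other">x</div>'
demo_soup = BeautifulSoup(sample, "html.parser")
print(find_by_class(demo_soup, "div", "dp-value"))  # prints ['164.51']
```

On a page that fills in its values with JavaScript, `fetch_soup(url)` would still return only the initial HTML, so the elements you want may not be there. That is the case where Selenium is the better fit.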
