Obe

Reputation: 15

Python - BeautifulSoup: Pull stock data from Morningstar

I am trying to extract two data points (the Last Close price and the Market Cap) from the Morningstar website for a list of companies and save them to a text file, but I am not sure how to approach this task. Below is my code:

from bs4 import BeautifulSoup as BS

thislist = ["AAPL","FB","TSLA","DIS"] 
for symbol in thislist:
    print ('Getting data for ' + symbol + '...\n')

# extract from this website
url="https://www.morningstar.com/stocks/xnas/" + symbol + "/quote"
        
soup = BS(url)
        
# Find the Value of Last Close Price
for text in soup.find_all('div class', name_='Last Close'):
    Last_Close = text.find_all('dp-value price-down')
    print(Last_Close)     
        
# Find the Value of its Market Cap
for text in soup.find_all('div class', name_='Market Cap'):
    Market_Cap = text.find_all('dp-value')
    print(Market_Cap)      
        
# Print the table
print(symbol, Last_Close, Market_Cap)
            
# Save the data in a .txt file
df.to_csv(r'c:\data\testing.txt', header=None, index=None, sep=' ', mode='a')

Upvotes: 1

Views: 1235

Answers (2)

Joe

Reputation: 217

First of all, this code will get you the information you need:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.common.by import By
import time

symbols = ["AAPL", "FB", "TSLA", "DIS"]

# Absolute XPaths to the two values on the quote page; they differ only in the <li> index
LAST_CLOSE_XPATH = '//*[@id="__layout"]/div/div[2]/div[3]/main/div[2]/div/div/div[1]/div[1]/div/sal-components/section/div/div/div/sal-components-quote/div/div/div/div/div/div[2]/ul/li[1]/div/div[2]'
MARKET_CAP_XPATH = '//*[@id="__layout"]/div/div[2]/div[3]/main/div[2]/div/div/div[1]/div[1]/div/sal-components/section/div/div/div/sal-components-quote/div/div/div/div/div/div[2]/ul/li[7]/div/div[2]'


def download_data(symbol):
    url = f'https://www.morningstar.com/stocks/xnas/{symbol}/quote'
    s = Service(ChromeDriverManager().install())
    op = webdriver.ChromeOptions()
    op.add_argument("--headless")  # run Chrome without opening a window
    driver = webdriver.Chrome(service=s, options=op)
    try:
        driver.get(url)

        # Give the JavaScript-rendered content a moment to load
        time.sleep(2)

        last_close = driver.find_element(by=By.XPATH, value=LAST_CLOSE_XPATH)
        market_cap = driver.find_element(by=By.XPATH, value=MARKET_CAP_XPATH)
        # symbol, Last_Close, Market_Cap
        return symbol, last_close.text, market_cap.text
    finally:
        driver.quit()  # always close the browser, even if a lookup fails


for symbol in symbols:
    print(download_data(symbol))

The output looks like this:

('AAPL', '164.51', '2.6529 Tril')
('FB', '316.56', '843.3460 Bil')
('TSLA', '996.27', '947.9256 Bil')

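The page for Disney doesn't exist at that URL, most likely because DIS is listed on the NYSE while the URL hard-codes the NASDAQ segment (xnas), so you might want to check each URL before scraping it.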
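One way to handle tickers on different exchanges is to look up the exchange segment per symbol before building the URL. The sketch below is only an illustration: `EXCHANGE_BY_SYMBOL` and `build_quote_url` are hypothetical names, and the `xnys` segment for NYSE listings is an assumption you should confirm in a browser first.

```python
# Hypothetical ticker-to-exchange mapping. "xnas" comes from the original URL;
# "xnys" for NYSE-listed tickers is an assumption to verify by hand.
EXCHANGE_BY_SYMBOL = {
    "AAPL": "xnas",
    "FB": "xnas",
    "TSLA": "xnas",
    "DIS": "xnys",
}


def build_quote_url(symbol, default_exchange="xnas"):
    """Build the quote URL for a symbol, falling back to the default exchange."""
    exchange = EXCHANGE_BY_SYMBOL.get(symbol, default_exchange)
    return f"https://www.morningstar.com/stocks/{exchange}/{symbol}/quote"


for symbol in ["AAPL", "DIS"]:
    print(build_quote_url(symbol))
```

Symbols missing from the mapping fall back to `xnas`, which matches what the original code did for every ticker.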

You can collect the results in a DataFrame and export them to CSV however you like. I would suggest using Selenium instead of Beautiful Soup: it spares you the headache of finding information that is rendered dynamically with JavaScript, which Beautiful Soup on its own cannot execute. Selenium drives a real browser, so it sees the page just as you would when you visit it.
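For the saving step, here is a minimal sketch that appends the scraped tuples to a space-separated text file. It uses the standard-library `csv` module rather than a pandas DataFrame, and `save_rows` is a hypothetical helper name. The rows are copied from the sample output above, and the file goes to a temporary directory only so the example runs anywhere.

```python
import csv
import os
import tempfile


def save_rows(rows, path):
    """Append rows to a space-separated text file.

    Fields that contain a space (such as '2.6529 Tril') are quoted
    automatically by the csv writer so they can be read back intact.
    """
    with open(path, "a", newline="") as handle:
        writer = csv.writer(handle, delimiter=" ")
        writer.writerows(rows)


# Rows taken from the sample output; in practice these would come from the scraper.
rows = [
    ("AAPL", "164.51", "2.6529 Tril"),
    ("FB", "316.56", "843.3460 Bil"),
]

out_path = os.path.join(tempfile.mkdtemp(), "testing.txt")
save_rows(rows, out_path)

with open(out_path, newline="") as handle:
    print(handle.read())
```

Opening the file in append mode mirrors the `mode='a'` argument in the question, so repeated runs add lines instead of overwriting the file.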

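Also, in your code you call soup = BS(url). That passes the URL string itself to Beautiful Soup, which only parses markup and does not download anything. I believe you need to make an HTTP request first, for example with the requests library in Python, and hand the response text to BS, but I haven't used BS in a while.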
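To illustrate the fetch-then-parse order, here is a rough sketch. `fetch_html` and `extract_value` are hypothetical helpers, and `SAMPLE_HTML` is markup invented for this example; it does not mirror Morningstar's real page, whose values are rendered by JavaScript and so would not appear in the raw response anyway.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url):
    """Download a page and return its HTML text (needs network access)."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # raise on 4xx/5xx responses
    return response.text


def extract_value(html, label):
    """Return the text of the value <div> that follows the given label <div>.

    Assumes the invented layout of SAMPLE_HTML below.
    """
    soup = BeautifulSoup(html, "html.parser")
    label_div = soup.find("div", class_="dp-name", string=label)
    if label_div is None:
        return None
    value_div = label_div.find_next_sibling("div", class_="dp-value")
    return value_div.get_text(strip=True) if value_div else None


# Invented markup for illustration only; not Morningstar's real HTML.
SAMPLE_HTML = """
<ul>
  <li><div class="dp-name">Last Close</div><div class="dp-value">164.51</div></li>
  <li><div class="dp-name">Market Cap</div><div class="dp-value">2.6529 Tril</div></li>
</ul>
"""

print(extract_value(SAMPLE_HTML, "Last Close"))
print(extract_value(SAMPLE_HTML, "Market Cap"))
```

For a live page you would pass `fetch_html(url)` instead of `SAMPLE_HTML`. Note also that `class_` is the keyword Beautiful Soup uses to filter on CSS class, in contrast to the `'div class'` and `name_=` arguments in the question.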

Upvotes: 1

Mr. Me

Reputation: 41

A scraper will react to live market conditions more slowly than something closer to the original source of the data. Several Python packages exist for pulling stock data directly. Here are some useful links on using pandas-datareader and yfinance:

https://www.mssqltips.com/sqlservertip/6826/techniques-for-collecting-stock-data-with-python/
https://towardsdatascience.com/how-to-get-stock-data-using-python-c0de1df17e75

Personally, I prefer the pandas route, as it has been more reliable for me and all my data generally ends up in a pandas DataFrame anyhow. DataReader can also pull directly from Morningstar (the linked page documents version 0.6.0): https://pandas-datareader.readthedocs.io/en/v0.6.0/readers/morningstar.html

Additionally, Quandl (now Nasdaq Data Link) is great for analyzing historical data if you are interested in developing an in-depth trading system: https://analyzingalpha.com/nasdaq-data-link-quandl-python-api

Upvotes: 1
