MasayoMusic

Reputation: 614

Extracting option dates from yahoo finance

I am having trouble extracting the option dates from the dropdown menu at the following URL:

url = 'https://finance.yahoo.com/quote/AAPL/options'

I have tried to locate the proper tags via BeautifulSoup, but the menu seems to be rendered using JavaScript. I looked at the network tab to see whether some JSON data is used to populate the menu, but I couldn't figure it out. Am I forced to use something like Selenium in this case? Selenium is heavyweight and slow.

Here is what I am attempting:

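    import requests
    from bs4 import BeautifulSoup

    url = 'https://finance.yahoo.com/quote/AAPL/options?p=AAPL'
    response = requests.get(url)
    # Save the raw response so it can be inspected in a browser
    with open('testing.html', 'wb') as f:
        f.write(response.content)
    soup = BeautifulSoup(response.content, 'html.parser')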

I then try to capture various elements, but I can't seem to get the option data:

    print(soup.find('div', {"class": "Cf Pt(18px) controls"}))
    print(soup.find('select'))
    dates = soup.find('div', class_="Fl(start) Pend(18px) option-contract-control drop-down-selector")
    print(dates)

However, I mostly get None back. After saving the HTML to a file and opening it, the option dropdown menu is missing, so most likely I am not capturing the part that is rendered by JavaScript.

Upvotes: 1

Views: 1308

Answers (2)

demian-wolf

Reputation: 1858

The best, clearest, and most Pythonic way to do what you want is to use the yfinance package (see the previous answer by @LoganGeorge).

However, if you want to do everything on your own, there are three options:

1) Getting JSON from Yahoo's API using requests, then converting it to a Python dictionary with the json module and reading the necessary key. In this case, as in most others where an API is available, this is a better way than scraping webpages.

You can look for such requests in the browser's DevTools and reproduce them in Python. Fortunately, there is an API here. Moreover, it is open, so the request does not need any custom headers or cookies. In case you do need them, see this question.

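    import datetime
    import json

    import requests


    url = "https://query1.finance.yahoo.com/v7/finance/options/AAPL"
    data = json.loads(requests.get(url).content)
    timestamps = data["optionChain"]["result"][0]["expirationDates"]
    # The timestamps are epoch seconds at midnight UTC, so convert them in UTC;
    # date.fromtimestamp() uses local time and can be off by one day.
    dates = [
        datetime.datetime.fromtimestamp(ts, tz=datetime.timezone.utc).date()
        for ts in timestamps
    ]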
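
If the endpoint ever starts rejecting bare requests, a slightly more defensive variant might look like the sketch below. It is only a sketch: it uses urllib from the standard library instead of requests so it runs without extra installs, the User-Agent value is an assumption, and parse_expirations and fetch_expirations are hypothetical helper names. The parsing is kept separate from the download so it can be checked offline against a saved response.

```python
import datetime
import json
import urllib.request

# Endpoint taken from the answer above; the User-Agent value is an assumption.
BASE_URL = "https://query1.finance.yahoo.com/v7/finance/options/"
HEADERS = {"User-Agent": "Mozilla/5.0"}


def parse_expirations(payload):
    """Turn the epoch-second timestamps in a decoded response into UTC dates."""
    results = payload.get("optionChain", {}).get("result") or []
    if not results:
        return []  # no usable result in the response body
    return [
        datetime.datetime.fromtimestamp(ts, tz=datetime.timezone.utc).date()
        for ts in results[0].get("expirationDates", [])
    ]


def fetch_expirations(symbol, timeout=10):
    """Download the option chain for `symbol` and return its expiration dates."""
    request = urllib.request.Request(BASE_URL + symbol, headers=HEADERS)
    # urlopen raises urllib.error.HTTPError on a non-success status code
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_expirations(json.load(response))
```

Calling fetch_expirations("AAPL") would then return a list of datetime.date objects, or raise if the server refuses the request.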

2) Getting the HTML page with requests and using BeautifulSoup to scrape necessary data (what you have tried).

Unfortunately, in this particular case you can't scrape that dropdown selection box, because it is generated on the client side by JavaScript, while requests just fetches the page from the server "as is" without executing any client-side code. The only way to use scraping here is to save the fully rendered page from the browser and pass it to BeautifulSoup, but that defeats the purpose.

3) Use Selenium. Note: if you are using it not for testing but to build an API or your own app, and you need everything to run quickly and without opening any windows, it is generally a bad solution. But if there is no API, the content is generated on the client side, and opening a browser window and installing a webdriver are not big problems for you, it may help you a lot.

Note: requests, BeautifulSoup, and Selenium are third-party packages, not part of the Python standard library. Don't forget to install the first two with pip install requests and pip install beautifulsoup4. For Selenium installation, look here.

Upvotes: 0

Logan George

Reputation: 155

Yeah, as Demian suggested, you could use the yfinance Python package. It looks like you'd do something like:

    import yfinance as yf
    aapl = yf.Ticker("AAPL")

    # show options expirations
    aapl.options
Upvotes: 2
