Reputation: 614
I am having trouble extracting the option dates from the dropdown menu at the following URL:
url = 'https://finance.yahoo.com/quote/AAPL/options'
I have tried to locate the proper tags via BeautifulSoup, but the menu seems to be rendered using JavaScript. I looked at the Network tab to see whether some JSON data is being used to populate the menu, but I couldn't figure it out. Am I forced to use something like Selenium in this case? Selenium is heavyweight and slow.
Here is what I am attempting:
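import requests
from bs4 import BeautifulSoup

url = 'https://finance.yahoo.com/quote/AAPL/options?p=AAPL'
response = requests.get(url)
# Save the raw response so it can be inspected in a browser or editor
with open('testing.html', 'wb') as f:
    f.write(response.content)
soup = BeautifulSoup(response.content, 'html.parser')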
I am trying to capture various elements, but I can't seem to capture the option data:
print(soup.find('div', {"class": "Cf Pt(18px) controls"}))
print(soup.find('select'))
dates = soup.find('div', class_="Fl(start) Pend(18px) option-contract-control drop-down-selector")
print(dates)
However, I am mostly getting None. After saving the HTML to a file and opening it, I see that the option dropdown menu is missing, so most likely I am unable to capture the JavaScript-rendered portion.
Upvotes: 1
Views: 1308
Reputation: 1858
The best, clearest, and most Pythonic way to do what you want is to use the yfinance package (see the previous answer by @LoganGeorge).
However, if you want to do everything on your own, you have three options:
1) Getting JSON from Yahoo's API using requests (in this case, as in most others where an API is available, this is better than scraping web pages), then converting it to a Python dictionary using the json module and reading the necessary key.
You can look for such requests in the browser's DevTools and reproduce them in Python. Fortunately, there is an API. Moreover, at the time of writing it is open, so it's not necessary to add headers or cookies to the request. In case you do need them, see this question.
import datetime
import json
import requests
url = "https://query1.finance.yahoo.com/v7/finance/options/AAPL"
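# The API returns the expiration dates as Unix timestamps (seconds since the epoch)
timestamps = json.loads(requests.get(url).content)["optionChain"]["result"][0]["expirationDates"]
# Interpret them as UTC so the dates do not shift with the local time zone
dates = [datetime.datetime.fromtimestamp(timestamp, tz=datetime.timezone.utc).date() for timestamp in timestamps]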
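If it helps, the key lookup can be separated from the network call so that it can be checked without hitting Yahoo. This is only a sketch: the function name expiration_dates and the sample payload are made up for illustration, and the timestamps are assumed to be epoch seconds read as UTC. Only the optionChain / result / expirationDates keys come from the snippet above.

```python
import datetime


def expiration_dates(payload):
    """Return the expiration dates found in an already-decoded options payload.

    `payload` is assumed to look like the dictionary decoded from the API
    response: {"optionChain": {"result": [{"expirationDates": [...]}]}}.
    """
    results = payload.get("optionChain", {}).get("result") or []
    if not results:
        # No result entry (for example, an unknown ticker): nothing to convert
        return []
    timestamps = results[0].get("expirationDates", [])
    # Treat each timestamp as epoch seconds in UTC (an assumption)
    return [
        datetime.datetime.fromtimestamp(ts, tz=datetime.timezone.utc).date()
        for ts in timestamps
    ]


# Hypothetical payload with two made-up timestamps, not real API output
sample = {"optionChain": {"result": [{"expirationDates": [1610668800, 1611273600]}]}}
print(expiration_dates(sample))
```

With a real response you would pass the decoded dictionary to the function instead of the sample.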
2) Getting the HTML page with requests and using BeautifulSoup to scrape the necessary data (what you have tried).
Unfortunately, in this particular case, you can't scrape that dropdown selection box because it is generated on the client side using JavaScript, while requests just gets the page from the server "as is" without executing any client-side code. The only way to use scraping here is to save the fully rendered page from the browser and pass it to BeautifulSoup, but that makes little sense.
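To illustrate why the parser comes back empty-handed, here is a small sketch with a made-up page (not Yahoo's real markup) in which the select element is only created by a script. It uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra packages. The names RAW_HTML, TagCounter and count_tags are hypothetical.

```python
from html.parser import HTMLParser

# Made-up stand-in for a server response: the <select> would only exist after
# a browser runs the script, so it is absent from the raw markup.
RAW_HTML = """
<html><body>
<div id="controls"></div>
<script>
  var menu = document.createElement("select");
  document.getElementById("controls").appendChild(menu);
</script>
</body></html>
"""


class TagCounter(HTMLParser):
    """Counts the start tags present in the static markup."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def handle_starttag(self, tag, attrs):
        self.counts[tag] = self.counts.get(tag, 0) + 1


def count_tags(html):
    """Parse the markup without running any script and return tag counts."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


counts = count_tags(RAW_HTML)
# The empty container div is found, but no select element is
print(counts.get("div", 0), counts.get("select", 0))
```

The script body is treated as plain text by the parser, which is the same reason soup.find('select') returns None for a page that builds its menu in the browser.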
3) Using Selenium. Note: generally, if you are using it not for testing but for building an API or your own app, and you need everything to run quickly and without opening any windows, it's a bad solution. But if there is no API, the content is generated on the client side, you need to get something working quickly, and opening a browser window and additionally installing a WebDriver are not big problems for you, it may help you a lot.
Note: requests, BeautifulSoup and Selenium are not built-in Python packages. Don't forget to install them with pip install requests and pip install beautifulsoup4. For Selenium installation, look here.
Upvotes: 0
Reputation: 155
Yeah, as Demian suggested, you could use the yfinance Python package. It looks like you'd do something like:
import yfinance as yf
aapl = yf.Ticker("AAPL")
# show options expirations
aapl.options
Upvotes: 2