Reputation: 193
I am working on a project that needs finance data, so I need to scrape historical data from Yahoo Finance. For example, on https://finance.yahoo.com/quote/ETH-USD/history?p=ETH-USD I need to adjust the time interval and press the download button. How can I automate this task with Python?
Sorry for my grammatical mistakes; my native language is not English.
Upvotes: 1
Views: 177
Reputation: 1292
You could use Selenium WebDriver to load the page, locate the WebElement containing the download button, and click() it, but that would be slow and brittle compared to calling the API directly.
My approach to this problem would be to reverse engineer the Yahoo Finance URL and fetch the data with the Requests library. The result is a CSV with the historical data that you're looking for.
If you look at the download URL, the query parameters are fairly intuitive:
https://query1.finance.yahoo.com/v7/finance/download/ETH-USD?period1=1581795382&period2=1613417782&interval=1d&events=history&includeAdjustedClose=true
We can see that the key components to modify are the stock ticker, date range, and interval. In code...
import csv
from datetime import datetime, timedelta
from io import StringIO

import requests

ticker = 'ETH-USD'
url = f'https://query1.finance.yahoo.com/v7/finance/download/{ticker}'

# period1 and period2 are Unix timestamps in seconds; here, the last 365 days.
now = datetime.now()
start_ts = int((now - timedelta(days=365)).timestamp())
end_ts = int(now.timestamp())

params = {
    'period1': start_ts,
    'period2': end_ts,
    'interval': '1d',
    'events': 'history',
    'includeAdjustedClose': 'true',  # lowercase, matching the URL above
}

result = requests.get(url, params=params)
result.raise_for_status()  # stop on an HTTP error instead of parsing an error page

# Parse the CSV body and print each row tab-separated.
reader = csv.reader(StringIO(result.text))
for row in reader:
    print('\t'.join(row))
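Since the question is about adjusting the time range, here is a minimal sketch of building the same download URL from calendar dates rather than "the last 365 days". It uses only the standard library. The helper name build_download_url is hypothetical, and treating the dates as UTC is my assumption, not something the page documents.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

BASE_URL = 'https://query1.finance.yahoo.com/v7/finance/download'


def build_download_url(ticker, start, end, interval='1d'):
    """Hypothetical helper: build the download URL for a ticker and date range.

    `start` and `end` are naive datetimes that are interpreted as UTC
    (an assumption) and converted to Unix timestamps in seconds.
    """
    params = {
        'period1': int(start.replace(tzinfo=timezone.utc).timestamp()),
        'period2': int(end.replace(tzinfo=timezone.utc).timestamp()),
        'interval': interval,
        'events': 'history',
        'includeAdjustedClose': 'true',
    }
    return f'{BASE_URL}/{ticker}?{urlencode(params)}'


url = build_download_url('ETH-USD', datetime(2020, 1, 1), datetime(2021, 1, 1))
print(url)
```

The resulting string can be passed straight to requests.get(url) in place of the url/params pair above.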
Upvotes: 0
Reputation: 266
To extract data from Yahoo Finance, you can use a Python library called yfinance.
In your case, by using this library you would do this:
import yfinance as yf

# 'ETH-USD' is the symbol from the question's URL.
eth = yf.Ticker('ETH-USD')
eth_history = eth.history(period="1y")
history() returns a pandas DataFrame, so you can then do whatever you want with the data (for example, save it to a spreadsheet with to_csv() or to_excel()).
Upvotes: 1
Reputation: 186
You can use a library that implements the Chrome DevTools Protocol (CDP) to automate the Chrome browser or a headless Chromium browser (or any browser supporting this protocol).
Here is one library I found by searching: https://github.com/hyperiongray/trio-chrome-devtools-protocol, but I'm sure there are others too. I have not used it personally.
Upvotes: 0