Reputation: 3640
I'd like to scrape every treasury yield rate that is available on the treasury.gov website.
How would I go about getting this information? I'm assuming that I'd have to use BeautifulSoup or Selenium or something like that (preferably BS4). I'd eventually like to put this data in a Pandas DataFrame.
Upvotes: 2
Views: 4851
Reputation: 11
Here is a different method to download the interest rates, reflecting the changes Treasury made to the CSV downloads for the "all" time period on June 2, 2022. You can use the historical archive option to download the older data and run this code to keep it updated.
import pandas as pd
import requests
# CSV endpoint for all 2022 daily Treasury yield curve rates
csv_url = 'https://home.treasury.gov/resource-center/data-chart-center/interest-rates/daily-treasury-rates.csv/2022/all?field_tdr_date_value=2022&type=daily_treasury_yield_curve&page&_format=csv'
# Download the CSV (certificate verification disabled)
req = requests.get(csv_url, verify=False)
# Write the raw bytes to disk, then read them back into a DataFrame
with open('2022_rates.csv', 'wb') as csv_file:
    csv_file.write(req.content)
rates_2022 = pd.read_csv('2022_rates.csv')
rates_2022
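The year appears twice in that URL, so, assuming the same pattern holds for other years (an assumption, not something the answer confirms), you could loop over a range of years and concatenate the results into one DataFrame. A rough sketch:
import io
import pandas as pd
import requests
frames = []
for year in range(2020, 2023):  # adjust the range of years as needed
    # Hypothetical generalization: substitute the year in both places it appears in the URL
    url = (f'https://home.treasury.gov/resource-center/data-chart-center/interest-rates/'
           f'daily-treasury-rates.csv/{year}/all?field_tdr_date_value={year}'
           f'&type=daily_treasury_yield_curve&page&_format=csv')
    r = requests.get(url, verify=False)
    # Parse the downloaded CSV text directly, without writing it to disk
    frames.append(pd.read_csv(io.StringIO(r.text)))
all_rates = pd.concat(frames, ignore_index=True)
print(all_rates.head())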
Upvotes: 1
Reputation: 2775
Here's one way you can grab the table data using requests and BeautifulSoup:
import pandas as pd
import requests
from bs4 import BeautifulSoup
url = 'https://www.treasury.gov/resource-center/data-chart-center/interest-rates/Pages/TextView.aspx?data=yieldAll'
r = requests.get(url)
html = r.text
soup = BeautifulSoup(html, 'html.parser')
# The yield data lives in the table with class "t-chart"
table = soup.find('table', {"class": "t-chart"})
rows = table.find_all('tr')
data = []
# Skip the header row and collect the stripped text of every cell in each data row
for row in rows[1:]:
    cols = row.find_all('td')
    cols = [ele.text.strip() for ele in cols]
    data.append([ele for ele in cols if ele])
result = pd.DataFrame(data, columns=['Date', '1 Mo', '2 Mo', '3 Mo', '6 Mo', '1 Yr', '2 Yr', '3 Yr', '5 Yr', '7 Yr', '10 Yr', '20 Yr', '30 Yr'])
print(result)
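Since the scraped values come back as strings, you may want a small post-processing step before analysis. A minimal sketch, assuming the result DataFrame and column names from the snippet above, that parses the dates and converts the yields to numbers:
# Convert the Date column to datetimes and every yield column to floats
result['Date'] = pd.to_datetime(result['Date'])
for col in result.columns[1:]:
    result[col] = pd.to_numeric(result[col], errors='coerce')  # non-numeric cells become NaN
result = result.set_index('Date')
print(result.dtypes)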
Upvotes: 5