Reputation: 126
trying to address a certain value in a data frame.
I got a CSV file that looks like this: enter image description here
I used the stock column as an index for a for loop. then I got historical price data from yahoo finance using selenium which I transformed into a data frame looks like this: enter image description here I want to search the stock date in the data frame I got from yahoo finance.
for stock_symbol in mystocks:
yahoo_url = "https://finance.yahoo.com/quote/"+stock_symbol+"/history?period1=0&period2=2597263000&interval=1d&filter=history&frequency=1d"
getting the data
bs_data = BeautifulSoup(source_data, features="lxml")
table = bs_data.find('table', attrs={"data-test": "historical-prices"})
dataf = pd.read_html(str(table))
I want to create a for loop that gets me the values from the wanted date row and puts it into a csv (stock name Date Open High Low Close* Adj Close** Volume).
the output I want is only a certain date from the CSV I got. for example in the CSV for CALM the date wanted is Apr 01, 2019 so I want to pull only this date data from the Yahoo data frame.
Upvotes: 1
Views: 85
Reputation: 2015
I am not an expert with BeautifulSoup, I took a stab at this by downloading the data in csv format to a default directory (chrome options) (there is a link to download)
This is by no means an answer to your question as you are using BeautifulSoup, but something you can consider.
import time
from pathlib import Path
import pandas as pd
from selenium.webdriver import Remote
from selenium.webdriver.chrome.options import Options
download_path = Path(r'C:\stackoverflow')
options = Options()
options.add_experimental_option("prefs", {
"download.default_directory": str(download_path),
"download.prompt_for_download": False,
"download.directory_upgrade": True,
"safebrowsing.enabled": True
})
driver = Remote(options=options)
stock_symbols = ['CALM', 'CTRA', 'NVGS', 'ANGO']
for stock_symbol in stock_symbols:
driver.get(f'https://finance.yahoo.com/quote/{stock_symbol}/history?period1=0&period2=2597263000&interval=1d&filter=history&frequency=1d')
time.sleep(5) # Replace with Webdriver Wait
download_data_link = driver.find_element_by_link_text('Download Data')
file_name = download_data_link.get_attribute('download')
download_data_link.click()
file_path = download_path / file_name
while True:
if file_path.exists():
break
df = pd.DataFrame.from_csv(file_path)
df['Stock Name'] = stock_symbol
print(df.head())
break
Output
Open High Low Close Adj Close Volume Stock Name
Date
1996-12-12 1.81250 1.8125 1.68750 1.703125 0.743409 1984400 CALM
1996-12-13 1.71875 1.8125 1.65625 1.781250 0.777510 996800 CALM
1996-12-16 1.81250 1.8125 1.71875 1.718750 0.750229 122000 CALM
1996-12-17 1.75000 1.8125 1.75000 1.773425 0.774094 239200 CALM
1996-12-18 1.81250 1.8125 1.75000 1.812500 0.791151 216400 CALM
filter by date
df = df.reset_index()
print(df.loc[df['Date'] == '1996-12-12'])
Date Open High Low Close Adj Close Volume Stock
Name
0 1996-12-12 1.8125 1.8125 1.6875 1.703125 0.743409 1984400 CALM
Upvotes: 1