bikoman57
bikoman57

Reputation: 126

addressing a value in data frame next to a certain value

trying to address a certain value in a data frame.

I got a CSV file that looks like this: enter image description here

I used the stock column as an index for a for loop. then I got historical price data from yahoo finance using selenium which I transformed into a data frame looks like this: enter image description here I want to search the stock date in the data frame I got from yahoo finance.

for stock_symbol in mystocks:
yahoo_url = "https://finance.yahoo.com/quote/"+stock_symbol+"/history?period1=0&period2=2597263000&interval=1d&filter=history&frequency=1d"

getting the data

bs_data = BeautifulSoup(source_data, features="lxml")
table = bs_data.find('table', attrs={"data-test": "historical-prices"})
dataf = pd.read_html(str(table))

I want to create a for loop that gets me the values from the wanted date row and puts it into a csv (stock name Date Open High Low Close* Adj Close** Volume).

the output I want is only a certain date from the CSV I got. for example in the CSV for CALM the date wanted is Apr 01, 2019 so I want to pull only this date data from the Yahoo data frame.

Upvotes: 1

Views: 85

Answers (1)

Satish Michael
Satish Michael

Reputation: 2015

I am not an expert with BeautifulSoup, I took a stab at this by downloading the data in csv format to a default directory (chrome options) (there is a link to download)

This is by no means an answer to your question as you are using BeautifulSoup, but something you can consider.

import time
from pathlib import Path

import pandas as pd
from selenium.webdriver import Remote
from selenium.webdriver.chrome.options import Options

download_path = Path(r'C:\stackoverflow')

options = Options()
options.add_experimental_option("prefs", {
    "download.default_directory": str(download_path),
    "download.prompt_for_download": False,
    "download.directory_upgrade": True,
    "safebrowsing.enabled": True
})

driver = Remote(options=options)
stock_symbols = ['CALM', 'CTRA', 'NVGS', 'ANGO']

for stock_symbol in stock_symbols:
    driver.get(f'https://finance.yahoo.com/quote/{stock_symbol}/history?period1=0&period2=2597263000&interval=1d&filter=history&frequency=1d')
    time.sleep(5) # Replace with Webdriver Wait

    download_data_link = driver.find_element_by_link_text('Download Data')
    file_name = download_data_link.get_attribute('download')
    download_data_link.click()

    file_path = download_path / file_name
    while True:
        if file_path.exists():
            break

    df = pd.DataFrame.from_csv(file_path)
    df['Stock Name'] = stock_symbol
    print(df.head())

    break

Output

           Open    High      Low     Close  Adj Close   Volume Stock Name
 Date                                                                         
 1996-12-12  1.81250  1.8125  1.68750  1.703125   0.743409  1984400       CALM
 1996-12-13  1.71875  1.8125  1.65625  1.781250   0.777510   996800       CALM
 1996-12-16  1.81250  1.8125  1.71875  1.718750   0.750229   122000       CALM
 1996-12-17  1.75000  1.8125  1.75000  1.773425   0.774094   239200       CALM
 1996-12-18  1.81250  1.8125  1.75000  1.812500   0.791151   216400       CALM

filter by date

df = df.reset_index()
print(df.loc[df['Date'] == '1996-12-12'])

        Date    Open    High     Low     Close  Adj Close   Volume Stock 
Name
0 1996-12-12  1.8125  1.8125  1.6875  1.703125   0.743409  1984400       CALM

Upvotes: 1

Related Questions