Reputation: 37
Python newbie here.
I have a bunch of CSV files of stock prices. They are structured as date, opening, high, low, close, adj. close, volume. I only need the date (column 1), the opening (column 2) and close (column 5).
The task prohibits the use of pandas. How can I extract those three columns and return them?
This is my attempt, but that doesn't extract the 3 columns I need.
def read_daily_prices(daily_price):
daily_prices = []
with open(stocks_symbols + ".csv", "r", newline='') as csvfile:
csvreader = csv.DictReader(csvfile, delimiter=',')
for daily_price in csvreader:
daily_prices.append(daily_price)
Upvotes: 0
Views: 659
Reputation: 359
Hi👋🏻 Hope you are doing well!
If I understood your question correctly, you can do something like that:
import csv
path = "stocks.csv"
# stocks.csv
# date, opening, high, low, close, adj_close, volume
# 2022-01-01, opening_1, high_1, low_1, close_1, adj_close_1, volume_1
# 2022-01-02, opening_2, high_2, low_2, close_2, adj_close_2, volume_2
columns = ["date", "opening", "close"]
daily_prices = []
with open(path, newline="") as csvfile:
reader = csv.DictReader(csvfile, delimiter=",")
for row in reader:
daily_prices.append([row["date"], row["opening"], row["close"]])
print(daily_prices)
# [['2022-01-01', 'opening_1', 'close_1'], ['2022-01-02', 'opening_2', 'close_2']]
daily_prices_dicts = [dict(zip(columns, row)) for row in daily_prices]
print(daily_prices_dicts)
# [{'date': '2022-01-01', 'opening': 'opening_1', 'close': 'close_1'}, {'date': '2022-01-02', 'opening': 'opening_2', 'close': 'close_2'}]
the main idea is: csv.DictReader
returns list of dicts so you can select necessary columns by key. 🙂
Upvotes: 1
Reputation: 4098
Not sure about your reasons to not use pandas; but if your activity involves processing a 2-D dataframe, then it will be much easier with pandas/numpy.
If you want to use plain python APIs, then you can use built-in csv and access columns with indexes. Extract the required columns, create a new sub-list and append to result list.
import csv
def read_daily_prices():
daily_prices = []
with open("/content/sample_data/test.csv", "r", newline="") as csvfile:
csvreader = csv.reader(csvfile, delimiter=",")
for daily_price in csvreader:
daily_prices.append([daily_price[0], daily_price[1], daily_price[4]])
return daily_prices
#
read_daily_prices()
Sample Input:
date,opening,high,low,close,adjclose,volume
2022-12-09,100,101,99,99.5,99.5,10000000
2022-12-09,200,202,199,199,199.1,20000000
2022-12-09,300,303,299,299,299.2,30000000
Generated Output:
[['date', 'opening', 'close'],
['2022-12-09', '100', '99.5'],
['2022-12-09', '200', '199'],
['2022-12-09', '300', '299']]
Upvotes: 1