Reputation: 115
I have a set of CSVs in a folder that I am trying to loop through for my pandas script. I am using glob to select the files ending in .csv but it just returns the same .csv file every time.
I am trying to accomplish the following:
Basically, input the .csv file into the script, save the filename as a variable, run the rest of the script, and repeat until complete.
I am using Jupyter Notebook on macOS.
Here is my current code:
import yfinance as yf
import matplotlib
from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
import mplfinance as mpf
import glob
path = r'/Users/chris/Desktop/Files'
files = glob.glob(path + "/*.csv")
for f in files:
    dfb = pd.read_csv(f,usecols=['Time','Balance'],index_col=0, parse_dates=True)
photoname = files+'.png'
dfb["Balance"] = dfb["Balance"].str.split(expand=True).iloc[:,0]
dfb["Balance"] = dfb["Balance"].str.replace(',','').astype(float)
df = yf.Ticker("DOGE-USD").history(period='max')
df = df.loc["2021-01-01":]
newdfb = dfb['Balance'].resample('D').ohlc().dropna()
newdfb.drop(['open','high','low'],axis=1,inplace=True)
newdfb.columns = ['Balance']
dates = [d.date() for d in newdfb.index]
newdfb.index = pd.DatetimeIndex(dates)
newdfb.index.name = 'Time'
dfc = df.join(newdfb, how='outer').dropna()
dfc.index.name = 'Date'
ap = mpf.make_addplot(dfc['Balance'])
mpf.plot(dfc,type='candle',addplot=ap)
print(address)
mpf.plot(dfc,type='candle',addplot=ap, savefig=photoname) #This saves as a photo
Upvotes: 0
Views: 4529
Reputation: 115
The issue was that the lines following read_csv were not indented, so they were not part of the for f in files: loop. They ran only once, after the loop had finished, using whichever file was read last. After indenting the lines underneath read_csv, the code runs as it should.
Since the data loaded by df = yf.Ticker("DOGE-USD").history(period='max') and df = df.loc["2021-01-01":] is the same for every file, moving these two lines above the for loop is more efficient: the download then happens only once.
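To illustrate both points without pandas or yfinance, here is a minimal sketch. The file names and the load_reference helper are hypothetical stand-ins (for the paths returned by glob and for the price download), not part of the original script.

```python
# Hypothetical stand-ins for the CSV paths returned by glob.glob().
files = ["a.csv", "b.csv", "c.csv"]

download_calls = []  # records how often the "expensive" step runs


def load_reference():
    """Hypothetical stand-in for the price-history download."""
    download_calls.append(1)
    return "reference data"


# Mis-indented version: only the first line belongs to the loop body.
for f in files:
    current = f
seen_wrong = [current]  # runs once, after the loop, with the last file only

# Corrected version: the loop-invariant load is hoisted above the loop,
# and every per-file step is indented under the for statement.
reference = load_reference()
seen_right = []
for f in files:
    current = f
    seen_right.append((current, reference))

print(seen_wrong)                         # ['c.csv']
print([name for name, _ in seen_right])   # ['a.csv', 'b.csv', 'c.csv']
print(len(download_calls))                # 1
```

In the mis-indented version the loop still visits every file, but the follow-up code only ever sees the final value of the loop variable, which matches the symptom of getting the same file every time.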
Here is the solution code:
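import glob
import os
import yfinance as yf
import matplotlib
from matplotlib import pyplot as plt
import pandas as pd
import mplfinance as mpf

path = r'/Users/chris/Desktop/Files'
files = glob.glob(path + "/*.csv")

# Download the price history once; it is the same for every CSV.
df = yf.Ticker("DOGE-USD").history(period='max')
df = df.loc["2021-01-01":]

for f in files:
    # Everything below is indented so that it runs once per file.
    dfb = pd.read_csv(f, usecols=['Time', 'Balance'], index_col=0,
                      parse_dates=True)
    # Build the image name from the current file (f), not the list (files);
    # files + '.png' would raise a TypeError.
    photoname = os.path.splitext(f)[0] + '.png'
    # Keep the first whitespace-separated token of Balance and convert it to float.
    dfb["Balance"] = dfb["Balance"].str.split(expand=True).iloc[:, 0]
    dfb["Balance"] = dfb["Balance"].str.replace(',', '').astype(float)
    # Resample to daily values and keep only the closing balance of each day.
    newdfb = dfb['Balance'].resample('D').ohlc().dropna()
    newdfb.drop(['open', 'high', 'low'], axis=1, inplace=True)
    newdfb.columns = ['Balance']
    # Strip the time of day so the index lines up with the daily price data.
    dates = [d.date() for d in newdfb.index]
    newdfb.index = pd.DatetimeIndex(dates)
    newdfb.index.name = 'Time'
    # Join prices and balances, keeping only days present in both.
    dfc = df.join(newdfb, how='outer').dropna()
    dfc.index.name = 'Date'
    ap = mpf.make_addplot(dfc['Balance'])
    mpf.plot(dfc, type='candle', addplot=ap)  # show the chart in the notebook
    mpf.plot(dfc, type='candle', addplot=ap, savefig=photoname)  # save it as a PNG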
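One detail worth checking when saving a picture per file: the output name has to be built from the loop variable f rather than from the list files, because adding a string to a list raises a TypeError. A minimal sketch of one way to do this follows; the helper png_name_for and the file names wallet1.csv and wallet2.csv are made up for illustration.

```python
import os


def png_name_for(csv_path):
    """Return csv_path with its extension replaced by .png."""
    root, _ext = os.path.splitext(csv_path)
    return root + ".png"


# Hypothetical file names inside the folder used in the question.
files = [
    "/Users/chris/Desktop/Files/wallet1.csv",
    "/Users/chris/Desktop/Files/wallet2.csv",
]

# One output name per input file.
photonames = [png_name_for(f) for f in files]
print(photonames)

# Using the whole list instead of a single path fails.
try:
    files + ".png"
except TypeError as exc:
    print("TypeError:", exc)
```

Using os.path.splitext avoids names such as wallet1.csv.png, which plain concatenation with f + '.png' would produce.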
Thank you to @Nathan Mills and @Daniel Goldfarb for providing the solution in the original post comments.
Upvotes: 1