Reputation: 232
Each folder has a csv for each month of the year(1.csv,2.csv,3.csv etc) and the script creates a dataframe combining the 9th column for all 12 csv's into an xlsx sheet named concentrated.xlsx
. It works but only for one directory at a time
files = glob['2014/*.csv']
sorted_files = natsorted(files)
def read_9th(fn):
return pd.read_csv(fn, usecols=[9], names=headers)
big_df = pd.concat([read_9th(fn) for fn in sorted_files], axis=1)
writer = pd.ExcelWriter('concentrated.xlsx', engine='openpyxl')
big_df.to_excel(writer,'2014')
writer.save()
Is it possible to create a dataframe automatically for each directory without having to manually create one for each folder like this:
files14 = glob['2014/*.csv']
files15 = glob['2015/*.csv']
sorted_files14 = natsorted(files14)
sorted_files15 = natsorted(files15)
def read_9th(fn):
return pd.read_csv(fn, usecols=[9], names=headers)
big_df = pd.concat([read_9th(fn) for fn in sorted_files14], axis=1)
big_df1 = pd.concat([read_9th(fn) for fn in sorted_files15], axis=1)
writer = pd.ExcelWriter('concentrated.xlsx', engine='openpyxl')
big_df.to_excel(writer,'2014')
big_df1.to_excel(writer,'2015')
writer.save()
Upvotes: 1
Views: 167
Reputation: 936
If you get a list of the folders that want to process, e.g.
folders = os.listdir('.')
# or
folders = ['2014', '2015', '2016']
You could do something like:
writer = pd.ExcelWriter('concentrated.xlsx', engine='openpyxl')
for folder in folders:
files = glob('%s/*.csv' % folder)
sorted_files = natsorted(files)
big_df = pd.concat([read_9th(fn) for fn in sorted_files], axis=1)
big_df.to_excel(writer, folder)
writer.save()
Upvotes: 1