iOSecure
iOSecure

Reputation: 232

Automatically create dataframe for each folder

Each folder has a csv for each month of the year(1.csv,2.csv,3.csv etc) and the script creates a dataframe combining the 9th column for all 12 csv's into an xlsx sheet named concentrated.xlsx. It works but only for one directory at a time

files = glob['2014/*.csv']
sorted_files = natsorted(files)

def read_9th(fn):
    return pd.read_csv(fn, usecols=[9], names=headers)

big_df = pd.concat([read_9th(fn) for fn in sorted_files], axis=1)
writer = pd.ExcelWriter('concentrated.xlsx', engine='openpyxl') 
big_df.to_excel(writer,'2014')

writer.save()

Is it possible to create a dataframe automatically for each directory without having to manually create one for each folder like this:

files14 = glob['2014/*.csv']
files15 = glob['2015/*.csv']

sorted_files14 = natsorted(files14)
sorted_files15 = natsorted(files15)

def read_9th(fn):
    return pd.read_csv(fn, usecols=[9], names=headers)

big_df = pd.concat([read_9th(fn) for fn in sorted_files14], axis=1)
big_df1 = pd.concat([read_9th(fn) for fn in sorted_files15], axis=1)
writer = pd.ExcelWriter('concentrated.xlsx', engine='openpyxl') 
big_df.to_excel(writer,'2014')
big_df1.to_excel(writer,'2015')

writer.save()

Upvotes: 1

Views: 167

Answers (1)

Mr.F
Mr.F

Reputation: 936

If you get a list of the folders that want to process, e.g.

folders = os.listdir('.')
# or 
folders = ['2014', '2015', '2016']

You could do something like:

writer = pd.ExcelWriter('concentrated.xlsx', engine='openpyxl')
for folder in folders:
    files = glob('%s/*.csv' % folder)
    sorted_files = natsorted(files)

    big_df = pd.concat([read_9th(fn) for fn in sorted_files], axis=1) 
    big_df.to_excel(writer, folder)

writer.save()

Upvotes: 1

Related Questions