Reputation: 45
I have about 5600 directories structured as follows:
I need to merge all A files into one file, all B files into another file, and so on.
How can I do this?
Upvotes: 0
Views: 412
Reputation: 4827
IIUC, this should work for your case. I tested it with a RootDir containing 2 subdirectories, Dir1 and Dir2, each holding 2 files, A.csv and B.csv. You can change the value of rootdir to match your use case:
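import os
from collections import defaultdict

import pandas as pd

rootdir = 'RootDir'  # Change when needed to your root directory

# Collect every .csv file under rootdir, at any depth
files = [os.path.join(dp, f) for dp, dn, filenames in os.walk(rootdir) for f in filenames if os.path.splitext(f)[1] == '.csv']

# Group the frames by file name without extension ('A', 'B', ...).
# basename/splitext is used because str.rstrip('.csv') strips trailing
# characters from the set {'.', 'c', 's', 'v'}, not the suffix, and
# splitting on '/' is not portable.
frames = defaultdict(list)
for file in files:
    key = os.path.splitext(os.path.basename(file))[0]
    frames[key].append(pd.read_csv(file))

# Concatenate once per key; ignore_index=True renumbers the merged rows
df_dict = {key: pd.concat(dfs, ignore_index=True) for key, dfs in frames.items()}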
The output is a dictionary of DataFrames, df_dict, with A and B as keys. Use df_dict['A'] to access DataFrame A, and so on.
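Since the question asks for merged files rather than DataFrames in memory, here is a minimal sketch of one way to write each group back to disk. The function name merge_csvs_by_name and the output directory name merged are hypothetical, chosen only for illustration, and the sketch assumes that all files sharing a name have compatible columns:

```python
import os
import tempfile
from collections import defaultdict

import pandas as pd


def merge_csvs_by_name(rootdir, outdir):
    """Merge same-named CSV files under rootdir and write one CSV per name to outdir.

    Hypothetical helper for illustration; returns the merged DataFrames by name.
    """
    # Map each file stem ('A', 'B', ...) to the paths that carry it
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(rootdir):
        for name in filenames:
            stem, ext = os.path.splitext(name)
            if ext == '.csv':
                groups[stem].append(os.path.join(dirpath, name))

    os.makedirs(outdir, exist_ok=True)
    merged = {}
    for stem, paths in groups.items():
        # Sort the paths so the row order is reproducible between runs
        frames = [pd.read_csv(p) for p in sorted(paths)]
        merged[stem] = pd.concat(frames, ignore_index=True)
        merged[stem].to_csv(os.path.join(outdir, stem + '.csv'), index=False)
    return merged


# Small demo on a throwaway tree: RootDir/Dir1 and RootDir/Dir2, each with A.csv and B.csv
with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, 'RootDir')
    for subdir, offset in (('Dir1', 0), ('Dir2', 10)):
        os.makedirs(os.path.join(root, subdir))
        for name in ('A', 'B'):
            df = pd.DataFrame({'x': [offset, offset + 1]})
            df.to_csv(os.path.join(root, subdir, name + '.csv'), index=False)

    result = merge_csvs_by_name(root, os.path.join(tmp, 'merged'))
    print(sorted(result))
    print(len(result['A']))
```

Keep the output directory outside rootdir; otherwise a second run would pick up the merged files as inputs. With about 5600 directories, each group is still read fully into memory before writing, so very large inputs may need a streaming approach instead.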
Upvotes: 1