Nickel

Reputation: 590

Merge CSV files with the same name from multiple subfolders

I have CSV files spread across several folders. I want to read them and merge them into one CSV file.

Folder A has two subfolders, B and C. B and C each contain further subfolders, and the CSV files are in the deepest subfolders.

Here is the folder diagram: (image not included)

Upvotes: 0

Views: 2720

Answers (2)

zero

Reputation: 1725

You can use os.walk. It yields one 3-tuple per directory under the starting path, and the last element of each tuple lists the file names in that directory.

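import os

path = os.path.join('path', 'to', 'directory')
# join each file name with the directory it was found in (dirpath), not with the top-level path
files = [os.path.join(dirpath, name) for dirpath, dirnames, filenames in os.walk(path) for name in filenames if name.endswith('.csv')]

That list comprehension is equivalent to:

files = []
# each item yielded by os.walk unpacks into dirpath, dirnames, filenames
for dirpath, dirnames, filenames in os.walk(path):
    for name in filenames:
        # skip anything that is not a CSV file so pd.read_csv does not choke on it
        if name.endswith('.csv'):
            files.append(os.path.join(dirpath, name))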

Then combine the files with pd.concat:

import pandas as pd

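# ignore_index=True renumbers the rows instead of repeating each file's own 0..n index
combined_df = pd.concat([pd.read_csv(file) for file in files], ignore_index=True)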
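
Since the question title asks about files that share a name, one possible extension is to group the paths by file name before concatenating. The following is only a minimal sketch using the standard library. `group_csv_by_name` is a hypothetical helper (it is not part of os or pandas), and it assumes that files with the same name also share the same columns.

```python
import os
from collections import defaultdict


def group_csv_by_name(root):
    """Map each CSV file name to the sorted list of paths under root that carry it.

    Hypothetical helper for illustration only.
    """
    groups = defaultdict(list)
    # os.walk yields (dirpath, dirnames, filenames) for every directory under root
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".csv"):
                # join with dirpath so the path points at the file's real location
                groups[name].append(os.path.join(dirpath, name))
    return {name: sorted(paths) for name, paths in groups.items()}
```

Each list of paths in the returned dict could then be handed to pd.concat to produce one merged frame per file name.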

Upvotes: 1

Lambda

Reputation: 1392

You can use glob and pandas.concat.

import glob
import pandas as pd

files = glob.glob("A/*/*/*.csv")
df = pd.concat([pd.read_csv(f) for f in files])

# index=False keeps the row index out of the output file
df.to_csv("merged.csv", index=False)
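
The fixed pattern above only matches files exactly two folders below A. If the CSV files sit at varying depths, a recursive pattern may be a better fit. This is a rough sketch, assuming Python 3.5 or later (where glob accepts recursive=True); `find_csv_files` is a hypothetical helper name.

```python
import glob
import os


def find_csv_files(root):
    """Return every .csv path under root, at any depth, in sorted order.

    Hypothetical helper for illustration only.
    """
    # with recursive=True, "**" matches zero or more nested directories
    pattern = os.path.join(root, "**", "*.csv")
    return sorted(glob.glob(pattern, recursive=True))
```

The returned list can be used in place of `files` in the pd.concat call. Sorting makes the row order of the merged file reproducible, since glob itself does not guarantee an order.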

Upvotes: 2
