Grace Rich

Reputation: 33

How do I append multiple CSV files using Pandas data structures in Python

I have about 10 CSV files that I'd like to append into one file. My thought was to assign the file names to numbered data_file variables and then append them in a while loop, but I'm having trouble moving to the next numbered data_file in my loop. I keep getting errors such as "data_file does not exist" and "cannot concatenate 'str' and 'int' objects". I'm not even sure this is a realistic approach to my problem. Any help would be appreciated.

import pandas as pd

path = '//pathname'
data_file1= path + 'filename1.csv'
data_file2= path + 'filename2.csv'
data_file3= path + 'filename3.csv'
data_file4= path + 'filename4.csv'
data_file5= path + 'filename5.csv'
data_file6= path + 'filename6.csv'
data_file7= path + 'filename7.csv'

df = pd.read_csv(data_file1)

x = 2
while x < 8:
     data_file = 'data file' + str(x)
     tmdDF = pd.read_csv(data_file)
     df = df.append(tmpDF)
     x += x + 1

Upvotes: 3

Views: 4397

Answers (2)

Paulo Almeida

Reputation: 8061

You can use fileinput for this:

import fileinput

path = '//pathname'
files = [path + 'filename' + str(i) + '.csv' for i in range(1,8)]

with open('output.csv', 'w') as output, fileinput.input(files) as fh:
    for line in fh:
        if fileinput.isfirstline() and fileinput.lineno() != 1:
            continue
        output.write(line)  
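
For comparison, here is a minimal sketch of the same header-skipping idea written with plain open() calls instead of fileinput. The function name merge_csv_files and the throwaway demo files are hypothetical, and it assumes every file has the same single-line header and ends with a newline.

```python
import os
import tempfile


def merge_csv_files(paths, output_path):
    """Concatenate CSV files, keeping only the first file's header row."""
    with open(output_path, "w", newline="") as output:
        for index, path in enumerate(paths):
            with open(path, newline="") as source:
                header = source.readline()
                # Write the header once, from the first file only.
                if index == 0:
                    output.write(header)
                # Copy the remaining data rows unchanged.
                for line in source:
                    output.write(line)


# Demo on three small throwaway files (hypothetical names and contents).
tmp_dir = tempfile.mkdtemp()
paths = []
for i in range(1, 4):
    csv_path = os.path.join(tmp_dir, "filename" + str(i) + ".csv")
    with open(csv_path, "w", newline="") as f:
        f.write("id,value\n" + str(i) + "," + str(i * 10) + "\n")
    paths.append(csv_path)

merged_path = os.path.join(tmp_dir, "output.csv")
merge_csv_files(paths, merged_path)

with open(merged_path, newline="") as f:
    merged_text = f.read()
print(merged_text)
```

Because the rows are copied as text, nothing is parsed, so this will not notice files whose columns differ.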

Upvotes: 1

Isaac Drachman

Reputation: 994

I'm not sure what you're trying to do by constructing the string data_file inside the loop. You can't address a variable by a string containing its name: 'data file' + str(x) is just the text 'data file2', which pd.read_csv treats as the path of a file that doesn't exist. Also, as noted by Paulo, you're not incrementing the index correctly: x += x + 1 takes x from 2 to 5 to 11, so use x += 1 instead. Try the following code, but note that if all you want is to concatenate CSV files, you don't need pandas at all.

import pandas
filenames = ["filename1.csv", "filename2.csv", ...] # Fill in remaining files.
# DataFrame.append was removed in pandas 2.0, so read every file and concatenate once.
df = pandas.concat([pandas.read_csv(filename) for filename in filenames])
# df is now a dataframe of all the csv's in filenames appended together
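
If it helps, here is a rough sketch of how the numbered variables from the question could be replaced by a list of paths built in a loop, with the frames combined by pandas.concat. The helper name read_and_concat, the temporary directory, and the sample columns are all made up for the demo; swap in your real folder and file names.

```python
import os
import tempfile

import pandas as pd


def read_and_concat(paths):
    """Read each CSV and stack the rows into one DataFrame."""
    frames = [pd.read_csv(path) for path in paths]
    # ignore_index=True renumbers the rows 0..n-1 in the combined frame.
    return pd.concat(frames, ignore_index=True)


# Build the file names with str(i) instead of separate numbered variables.
tmp_dir = tempfile.mkdtemp()  # stand-in for the real folder
paths = []
for i in range(1, 8):
    path = os.path.join(tmp_dir, "filename" + str(i) + ".csv")
    # Write a one-row sample file so the demo can run on its own.
    pd.DataFrame({"id": [i], "value": [i * 10]}).to_csv(path, index=False)
    paths.append(path)

combined = read_and_concat(paths)
combined.to_csv(os.path.join(tmp_dir, "combined.csv"), index=False)
print(combined.shape)
```

Reading everything into a list and concatenating once also avoids copying the growing frame on every pass through the loop.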

Upvotes: 6
