FRizal

Reputation: 448

Python for loop to read csv using pandas

I can combine 2 CSV files with the script below, and it works well.

import pandas

csv1=pandas.read_csv('1.csv')
csv2=pandas.read_csv('2.csv')
merged=csv1.merge(csv2,on='field1')
merged.to_csv('output.csv',index=False)

Now I would like to combine more than 2 CSV files using the same method. I have a list of CSV files defined like this:

import pandas
collection=['1.csv','2.csv','3.csv','4.csv']
for i in collection:
  csv=pandas.read_csv(i)
  merged=csv.merge(??,on='field1')
  merged.to_csv('output2.csv',index=False)

I haven't got it to work so far with more than one CSV. I guess it is just a matter of iterating over the list. Any ideas?

Upvotes: 0

Views: 3309

Answers (1)

Aaron Digulla

Reputation: 328810

You need special handling for the first loop iteration:

import pandas
collection=['1.csv','2.csv','3.csv','4.csv']

result = None
for i in collection:
  csv=pandas.read_csv(i)
  if result is None:
    result = csv
  else:
    result = result.merge(csv, on='field1')

if result is not None:  # "if result:" raises ValueError for a DataFrame
  result.to_csv('output2.csv',index=False)

An alternative is to load the first CSV outside the loop, but this breaks when the collection is empty:

import pandas
collection=['1.csv','2.csv','3.csv','4.csv']

result = pandas.read_csv(collection[0])
for i in collection[1:]:
  csv = pandas.read_csv(i)
  result = result.merge(csv, on='field1')

# result is always a DataFrame here, so no None check is needed
result.to_csv('output2.csv',index=False)

I don't know how to create a suitable empty starting DataFrame in pandas, but if there were one, this would work too (pseudocode):

import pandas
collection=['1.csv','2.csv','3.csv','4.csv']

result = pandas.create_empty() # pseudocode: pandas has no create_empty() function
for i in collection:
  csv = pandas.read_csv(i)
  result = result.merge(csv, on='field1')

result.to_csv('output2.csv',index=False)
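
One way to avoid both the first-iteration special case and the need for an empty starting value is `functools.reduce` from the standard library, which uses the first item as the initial value. The following is only a minimal sketch: `merge_all` and its `key` parameter are names made up for this example, and it assumes every file has a `field1` column and that the default inner join is what you want.

```python
from functools import reduce

import pandas


def merge_all(sources, key="field1"):
    """Read every CSV in `sources` and merge them left to right on `key`.

    `merge_all` is a hypothetical helper, not part of pandas.
    Returns None when `sources` is empty, so the caller can skip writing.
    """
    frames = [pandas.read_csv(src) for src in sources]
    if not frames:
        return None
    # reduce() starts from the first frame, so no empty seed DataFrame
    # and no special handling of the first iteration are needed.
    return reduce(lambda left, right: left.merge(right, on=key), frames)
```

With the list from the question, this would be called as `result = merge_all(collection)`, followed by `result.to_csv('output2.csv', index=False)` guarded by `if result is not None:`.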

Upvotes: 1
