Reputation: 448
I can combine two CSV files and it works well:
import pandas
csv1=pandas.read_csv('1.csv')
csv2=pandas.read_csv('2.csv')
merged=csv1.merge(csv2,on='field1')
merged.to_csv('output.csv',index=False)
Now I would like to combine more than two CSV files using the same method. I have a list of CSV files defined like this:
import pandas
collection=['1.csv','2.csv','3.csv','4.csv']
for i in collection:
    csv=pandas.read_csv(i)
    merged=csv.merge(??,on='field1')
merged.to_csv('output2.csv',index=False)
I haven't got it to work so far with more than one CSV. I guess it is just a matter of iterating over the list. Any ideas?
Upvotes: 0
Views: 3309
Reputation: 328810
You need special handling for the first loop iteration:
import pandas
collection=['1.csv','2.csv','3.csv','4.csv']
result = None
for i in collection:
    csv=pandas.read_csv(i)
    if result is None:
        result = csv
    else:
        result = result.merge(csv, on='field1')
# "if result:" would raise ValueError, because a DataFrame has no single truth value
if result is not None:
    result.to_csv('output2.csv',index=False)
Another alternative would be to load the first CSV outside the loop, but this breaks when the collection is empty:
import pandas
collection=['1.csv','2.csv','3.csv','4.csv']
result = pandas.read_csv(collection[0])
for i in collection[1:]:
    csv = pandas.read_csv(i)
    result = result.merge(csv, on='field1')
# result is always a DataFrame here, so no emptiness check is needed
result.to_csv('output2.csv',index=False)
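To try the loop without any files on disk, here is a minimal sketch that wraps it in a helper and feeds it in-memory CSV text through io.StringIO. The name merge_all and the sample data are made up for illustration. The helper returns None for an empty collection, which makes that case explicit for the caller.

```python
import io
import pandas

def merge_all(sources, key):
    """Inner-merge every CSV in `sources` on `key`; return None if `sources` is empty."""
    result = None
    for source in sources:
        frame = pandas.read_csv(source)
        # The first frame seeds the result; later frames are merged into it.
        result = frame if result is None else result.merge(frame, on=key)
    return result

# Made-up sample data standing in for 1.csv, 2.csv and 3.csv
csv_texts = [
    "field1,a\n1,x\n2,y\n",
    "field1,b\n1,p\n2,q\n",
    "field1,c\n2,m\n3,n\n",
]
merged = merge_all([io.StringIO(text) for text in csv_texts], key="field1")
if merged is not None:
    print(merged)
```

Since merge defaults to an inner join, only rows whose field1 value appears in every input survive (here, the row with field1 equal to 2). With real files, pass the list of file names instead and call merged.to_csv('output2.csv', index=False) on the result.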
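Seeding the loop with an empty DataFrame (pandas.DataFrame()) does not work here: it has no 'field1' column, so the first merge raises a KeyError. To avoid the special case without a seed value, functools.reduce can fold the merge over the loaded frames, though like the previous version it fails on an empty collection:
import functools
import pandas
collection=['1.csv','2.csv','3.csv','4.csv']
frames = [pandas.read_csv(i) for i in collection]
# reduce merges the first two frames, then merges each following frame into that result
result = functools.reduce(lambda left, right: left.merge(right, on='field1'), frames)
result.to_csv('output2.csv',index=False)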
Upvotes: 1