Reputation: 149
I have 13 CSV files to merge. I wanted to try pandas and Python, but I am struggling.
There are 3 types of files, and the key is column a: 1) has columns a b c d; 2) has columns a b c d (with a not containing any values from 1); 3) has columns a b c d e f g (with a containing all values from 1 and 2).
How could I go about merging these all into one CSV containing all the info from all the files?
Upvotes: 0
Views: 1702
Reputation: 5362
You should do an outer merge as follows, making use of reduce (a built-in in Python 2; in Python 3 it has to be imported from functools):
import pandas
from functools import reduce
files = ['file1.csv', 'file2.csv', ...] # the 13 files
dataframes = [pandas.read_csv(f) for f in files] # add arguments to read_csv as necessary
merged = reduce(lambda left, right: pandas.merge(left, right, on='a', how='outer'), dataframes)
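One caveat with merging on 'a' alone: since b, c and d appear in every file, pandas will suffix the overlapping columns (b_x, b_y and so on) rather than combine them. A minimal sketch of one way around that, assuming rows that share a value of a also agree on b, c and d, is to merge each pair of frames on whatever columns they have in common. The helper name merge_on_shared and the sample frames are made up for illustration and stand in for the real files.

```python
from functools import reduce

import pandas as pd


def merge_on_shared(left, right):
    """Outer-merge two frames on every column name they have in common."""
    shared = [col for col in left.columns if col in right.columns]
    return pd.merge(left, right, on=shared, how="outer")


# Hypothetical stand-ins for the three file types described in the question.
type1 = pd.DataFrame({"a": [1, 2], "b": [10, 20], "c": [11, 21], "d": [12, 22]})
type2 = pd.DataFrame({"a": [3], "b": [30], "c": [31], "d": [32]})
type3 = pd.DataFrame({
    "a": [1, 2, 3], "b": [10, 20, 30], "c": [11, 21, 31], "d": [12, 22, 32],
    "e": ["x", "y", "z"], "f": [0.1, 0.2, 0.3], "g": [True, False, True],
})

# Fold the list pairwise, exactly as in the reduce call above.
merged = reduce(merge_on_shared, [type1, type2, type3])
```

With real files, the list passed to reduce would be the dataframes list, and merged.to_csv('merged.csv', index=False) would write the combined result.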
Upvotes: 2
Reputation: 51
Hard to write it exactly without seeing example data. But this should get you started.
import pandas as pd
df = pd.read_csv('file1.csv')
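df = pd.concat([df, pd.read_csv('file2.csv')], ignore_index=True) # this one adds more rows to the dataframe (DataFrame.append was removed in pandas 2.0)
df = df.merge(pd.read_csv('file3.csv'), on=['a', 'b', 'c', 'd'], how='left') # this one adds columns where the key columns match (join with on= would match against the other frame's index)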
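To make the difference between the two steps concrete, here is a small sketch on in-memory data: stacking adds rows, merging adds columns. The frames and their values are invented for illustration. It also shows that how='left' drops any row that exists only in the wide file, so how='outer' may be the safer choice if the third file type can contain extra values of a.

```python
import pandas as pd

# Hypothetical stand-ins for file1.csv, file2.csv and file3.csv.
rows1 = pd.DataFrame({"a": [1, 2], "b": [10, 20], "c": [11, 21], "d": [12, 22]})
rows2 = pd.DataFrame({"a": [3], "b": [30], "c": [31], "d": [32]})
wide = pd.DataFrame({
    "a": [1, 2, 3, 4], "b": [10, 20, 30, 40], "c": [11, 21, 31, 41],
    "d": [12, 22, 32, 42], "e": ["w", "x", "y", "z"],
})

# Step 1: same columns, different rows, so stack them vertically.
stacked = pd.concat([rows1, rows2], ignore_index=True)

# Step 2: bring in the extra column by matching on the shared key columns.
keys = ["a", "b", "c", "d"]
left = stacked.merge(wide, on=keys, how="left")    # keeps only rows present in stacked
outer = stacked.merge(wide, on=keys, how="outer")  # also keeps the a == 4 row from wide
```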
Upvotes: 0