blueman010112
blueman010112

Reputation: 476

Merge same columns in two different csv using python

I have several csv files

csv1:

date A B C
1991 1 2 3

csv2:

date A B C
1991 4 5 6

what I need is

csv3:

date A_csv1 A_csv2
1991    1      4

csv4:

date B_csv1 B_csv2
1991    2      5

csv5:

date C_csv1 C_csv2
1991    3      6 

how to do it?

I am using python and pandas

but almost all search results is pandas.merge that is not the result I need...

Upvotes: 1

Views: 68

Answers (2)

jezrael
jezrael

Reputation: 862681

Use concat with keys and axis=1 parameters:

df1 = pd.read_csv(file1, index_col=['date'])
df2 = pd.read_csv(file2, index_col=['date'])

df = pd.concat([df1, df2], axis=1, keys=('csv1','csv2'))
print (df)
     csv1       csv2      
        A  B  C    A  B  C
date                      
1991    1  2  3    4  5  6

And then iterate by second level of MultiIndex with flatten columns and write to files:

for i, g in df.groupby(level=1, axis=1):
    g.columns = [f'{b}_{a}' for a, b in g.columns]
    g.to_csv(f'{i}.csv')
    print (g)

      A_csv1  A_csv2
date                
1991       1       4
      B_csv1  B_csv2
date                
1991       2       5
      C_csv1  C_csv2
date                
1991       3       6

If files are in same folder:

import glob
files = glob.glob('files/*.csv')

df = pd.concat([pd.read_csv(f) for f in files], axis=1, keys=files)
print (df)

Upvotes: 1

Colton Neary
Colton Neary

Reputation: 90

I always use pandas concatenation function:

df=pd.read_csv(“csv1.csv”)
do=pd.read_csv(“csv2.csv”)

DF = pd.concat([df , do ] , axis = 1)

You can try changing axis and other parameters to get it to concat correctly.

Upvotes: 0

Related Questions