Reputation: 680

How to merge multiple csv files?

I have several csv files that has same first row element in it. For example:

csv-1.csv:
Value,0
Currency,0
datetime,0
Receiver,0
Beneficiary,0
Flag,0
idx,0

csv-2.csv:
Value,0
Currency,1
datetime,0
Receiver,0
Beneficiary,0
Flag,0
idx,0

And with these files (more than 2 files by the way) I want to merge them and create something like that:

left	csv-1	csv-2
Value	0	0
Currency	0	1
datetime	0	0

How can I create this funtion in python?

Upvotes: 1

Answers (3)

imxitiz

Reputation: 3987

Here's what you can do

import pandas as pd
from glob import glob

def refineFilename(path):
    return path.split(".")[0]

df=pd.DataFrame()

for file in glob("csv-*.csv"):
    new=pd.read_csv(file,header=None,index_col=[0])
    df[refineFinename(file)]=new[1]

df.reset_index(inplace=True)
df.rename(columns={0:"left"},inplace=True)

print(df)

"""
          left  csv-1  csv-2
0        Value      0      0
1     Currency      0      1
2     datetime      0      0
3     Receiver      0      0
4  Beneficiary      0      0
5         Flag      0      0
6          idx      0      0
"""

What we are doing here is making the df variable store all data, and then iterating through all files and adding a second column of those files to df with file name as the column name.

Upvotes: 1

Corralien

Reputation: 120559

You can use:

import pandas as pd
import pathlib

out = (pd.concat([pd.read_csv(csvfile, header=None, index_col=[0], names=[csvfile.stem])
           for csvfile in sorted(pathlib.Path.cwd().glob('*.csv'))], axis=1)
         .rename_axis('left').reset_index())

Output:

>>> out
          left  csv-1  csv-2
0        Value      0      0
1     Currency      0      1
2     datetime      0      0
3     Receiver      0      0
4  Beneficiary      0      0
5         Flag      0      0
6          idx      0      0

Upvotes: 1

angwrk

Reputation: 420

First, you must create indexes in dataframes by first columns, on which you will further join:

import pandas as pd
import numpy as np

df1 = pd.read_csv('csv-1.csv')
df2 = pd.read_csv('csv-2.csv')

df1 = df1.set_index('col1')
df2 = df2.set_index('col1')

df = df1.join(df2, how='outer')

Then rename the column names if needed, or make a new index

Upvotes: 2

How to merge multiple csv files?

Answers (3)

Related Questions