jack

Reputation: 861

Combine multiple CSV files using Python and Pandas

I have the following code:

import glob
import pandas as pd

# Raw strings keep the backslashes literal ("\f" would otherwise be a form feed)
all_files = glob.glob(r"C:\*.csv")
frames = []
for file_ in all_files:
    print(file_)
    df = pd.read_csv(file_, index_col=None, header=0)
    frames.append(df)
# Concatenate once, after every file has been read
frame = pd.concat(frames, sort=False)
print(frames)
frame.to_csv(r"C:\f.csv")

This combines multiple CSV files into a single CSV file.

However, it also adds a row-number column.

Input:

a.csv

a   b   c   d
1   2   3   4

b.csv

a   b   c   d
551 55  55  55
551 55  55  55

Result: f.csv

    a   b   c   d
0   1   2   3   4
0   551 55  55  55
1   551 55  55  55

How can I modify the code so that the row numbers are not written to the output file?

Upvotes: 0

Views: 78

Answers (2)

nosklo

Reputation: 222852

You don't need pandas for a task this simple. pandas parses each file and converts the data to NumPy structures, which is unnecessary work here. Plain text-file handling is enough:

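import glob

# Raw strings keep the backslashes literal ("\f" would otherwise be a form feed)
all_files = glob.glob(r"C:\*.csv")
first = True
with open(r"C:\f.csv", "w") as fw:
    for filename in all_files:
        print(filename)
        with open(filename, "r") as f:
            if not first:
                f.readline()  # skip the header of every file after the first
            first = False
            fw.writelines(f)  # copy the remaining lines unchanged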
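
One caveat with copying raw lines: if an input file does not end with a newline, its last row runs into the first row of the next file. A minimal sketch of the same idea using the standard csv module avoids that, because rows are rewritten one at a time. The function name combine_csv, the temporary file names, and the comma delimiter are assumptions made for illustration; they are not part of the original question.

```python
import csv
import os
import tempfile


def combine_csv(paths, out_path):
    """Append the rows of every file in `paths` to `out_path`, keeping one header."""
    with open(out_path, "w", newline="") as fw:
        writer = csv.writer(fw)
        for i, path in enumerate(paths):
            with open(path, newline="") as f:
                reader = csv.reader(f)
                header = next(reader, None)  # None if the file is empty
                if i == 0 and header is not None:
                    writer.writerow(header)  # write the header only once
                writer.writerows(reader)  # copy the remaining data rows


# Demo with throwaway files that mirror a.csv and b.csv from the question
# (assumed to be comma-delimited)
tmp = tempfile.mkdtemp()
a_path = os.path.join(tmp, "a.csv")
b_path = os.path.join(tmp, "b.csv")
out_path = os.path.join(tmp, "f.csv")
with open(a_path, "w") as f:
    f.write("a,b,c,d\n1,2,3,4")  # no trailing newline, on purpose
with open(b_path, "w") as f:
    f.write("a,b,c,d\n551,55,55,55\n551,55,55,55\n")

combine_csv([a_path, b_path], out_path)
with open(out_path, newline="") as f:
    combined = f.read()
print(combined)
```

Passing an explicit list of paths also keeps the output file out of the inputs, which a glob over the same folder would not guarantee on a second run.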

Upvotes: 1

Bera

Reputation: 1949

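Pass index=False when writing the file: frame.to_csv(r"C:\f.csv", index=False). By default, to_csv writes the DataFrame index as the first column, which is where the row numbers come from. The r prefix keeps "\f" from being read as a form-feed escape.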

See: pandas.DataFrame.to_csv
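
To see the difference without touching the disk, here is a small sketch that uses in-memory stand-ins for a.csv and b.csv (assumed to be comma-delimited) and compares the two outputs. The ignore_index=True argument is an optional extra, not something the question requires; it renumbers the combined rows 0..n-1 in case the index is needed later.

```python
import io

import pandas as pd

# In-memory stand-ins for a.csv and b.csv from the question
a_csv = io.StringIO("a,b,c,d\n1,2,3,4\n")
b_csv = io.StringIO("a,b,c,d\n551,55,55,55\n551,55,55,55\n")

frames = [pd.read_csv(src) for src in (a_csv, b_csv)]
frame = pd.concat(frames, ignore_index=True, sort=False)

# With no path argument, to_csv returns the CSV text as a string
with_index = frame.to_csv()  # default: index written as the first column
without_index = frame.to_csv(index=False)  # index column omitted

print(with_index)
print(without_index)
```

The first output starts with an empty header cell followed by the row numbers; the second contains only the columns a, b, c, d.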

Upvotes: 2
