Reputation: 11
I have a bunch of software output files that I have manipulated into csv-like text files. I have probably done this the hard way, because I am not too familiar with python library
The next step is to gather all this data in one single csv file. The files have different headers, or are sorted differently.
Lets say this is file A:
A | B | C | D | id
0 2 3 2 "A"
...
and this is file B:
B | A | Z | D | id
4 6 1 0 "B"
...
I want the append.csv file to look like:
A | B | C | D | Z | id
0 2 3 2 "A"
6 4 0 1 "B"
...
How can I do this, elegantly? Thank you for all answers.
Upvotes: 1
Views: 160
Reputation: 55599
The csv module in the standard library provides tools you can use to do this. The DictReader class produces a mapping of column name to value for each row in a csv file; the DictWriter class will write such mappings to a csv file.
DictWriter
must be provided with a list of column names, but does not require all column names to be present in each row mapping.
import csv
list_of_files = ['1.csv', '2.csv']
# Collect the column names.
all_headers = set()
for file_ in list_of_files:
with open(file_, newline='') as f:
reader = csv.reader(f)
headers = next(reader)
all_headers.update(headers)
all_headers = sorted(all_headers)
# Generate the output file.
with open('append.csv', 'w', newline='') as outfile:
writer = csv.DictWriter(outfile, fieldnames=all_headers)
writer.writeheader()
for file_ in list_of_files:
with open(file_, newline='') as f:
reader = csv.DictReader(f)
writer.writerows(reader)
$ cat append.csv
A,B,C,D,Z,id
0,2,3,2,,A
6,4,,0,1,B
Upvotes: 0
Reputation: 459
You can use pandas
to read CSV files into DataFrames and use the concat
method, then write the result to CSV:
import pandas as pd
df1 = pd.read_csv("file1.csv")
df2 = pd.read_csv("file2.csv")
df = pd.concat([df1, df2], axis=0, ignore_index=True)
df.to_csv("file.csv", index=False)
Upvotes: 1