Reputation: 1370
For example, my csv has columns as below:
ID, ID2, Date, Job No, Code
I need to write the columns back in the same order. The dict
jumbles the order immediately, so I believe it's more of a problem with the reader.
Upvotes: 42
Views: 32624
Reputation: 11
I wrote a little tool to sort the order of CSV columns. I don't claim that it's great, as I know little Python, but it does the job:
import csv
import sys

with open(sys.argv[1], 'r') as infile:
    csvReader = csv.DictReader(infile)
    sorted_fieldnames = sorted(csvReader.fieldnames)
    writer = csv.DictWriter(sys.stdout, fieldnames=sorted_fieldnames)
    # reorder the header first
    writer.writeheader()
    for row in csvReader:
        # writes the reordered rows to the new file
        writer.writerow(row)
Upvotes: 0
Reputation: 14865
Edit: as of Python 3.6, dicts preserve insertion order, essentially making all dicts in Python behave like OrderedDicts by default. That being said, the docs for 3.6 say not to rely on this behaviour because it may change. I will challenge that; let's see if it ever changes back :)
Unfortunately, the default DictReader does not allow overriding the dict class. A custom DictReader would do the trick, though:
import csv

class DictReader(csv.DictReader):
    def __init__(self, *args, **kwargs):
        # pop our extra keyword before delegating to csv.DictReader
        self.dict_class = kwargs.pop('dict_class', dict)
        super(DictReader, self).__init__(*args, **kwargs)

    def __next__(self):
        ''' copied from python source '''
        if self.line_num == 0:
            # Used only for its side effect.
            self.fieldnames
        row = next(self.reader)
        self.line_num = self.reader.line_num
        # unlike the basic reader, we prefer not to return blanks,
        # because we will typically wind up with a dict full of None
        # values
        while row == []:
            row = next(self.reader)
        # using the customized dict_class
        d = self.dict_class(zip(self.fieldnames, row))
        lf = len(self.fieldnames)
        lr = len(row)
        if lf < lr:
            d[self.restkey] = row[lf:]
        elif lf > lr:
            for key in self.fieldnames[lr:]:
                d[key] = self.restval
        return d
Use it like so:
import collections
csv_reader = DictReader(f, dict_class=collections.OrderedDict)
# ...
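On Python 3.7 and later, where insertion order is a language guarantee, the stock csv.DictReader should already yield rows whose keys follow the header order, so the subclass above is mainly of interest on older interpreters. A minimal check, assuming an in-memory file with made-up sample data (only the column names come from the question):

```python
import csv
import io

# Hypothetical sample data; the column names are the ones from the question.
sample = io.StringIO("ID,ID2,Date,Job No,Code\n1,2,2020-01-01,77,X\n")

reader = csv.DictReader(sample)
first_row = next(reader)

# Compare the key order of the row with the header order.
keys_in_row = list(first_row.keys())
header_order = list(reader.fieldnames)
print(keys_in_row == header_order)
```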
Upvotes: 2
Reputation: 882281
Python's dicts do NOT maintain order prior to 3.6 (but, regardless, in that version the csv.DictReader class was modified to return OrderedDicts).
However, the instance of csv.DictReader that you're using (after you've read the first row!-) does have a .fieldnames list of strings, which IS in order.
So,
for rowdict in myReader:
    print(['%s:%s' % (f, rowdict[f]) for f in myReader.fieldnames])
will show you that the order is indeed maintained (in .fieldnames of course, not in a plain dict on Python versions before 3.6, where that's intrinsically impossible!-).
So, suppose you want to read a.csv and write b.csv with the same column order. Using plain reader and writer is too easy, so you want to use the Dict varieties instead;-). Well, one way is...:
import csv

a = open('a.csv', 'r')
b = open('b.csv', 'w')
ra = csv.DictReader(a)
wb = csv.DictWriter(b, None)

for d in ra:
    if wb.fieldnames is None:
        # initialize and write b's headers
        dh = dict((h, h) for h in ra.fieldnames)
        wb.fieldnames = ra.fieldnames
        wb.writerow(dh)
    wb.writerow(d)

b.close()
a.close()
assuming you have headers in a.csv (otherwise you can't use a DictReader on it) and want just the same headers in b.csv.
Upvotes: 66
Reputation: 4855
Make an OrderedDict from each row dict, sorted by DictReader.fieldnames.
import csv
from collections import OrderedDict

reader = csv.DictReader(open("file.csv"))
for row in reader:
    sorted_row = OrderedDict(sorted(row.items(),
        key=lambda item: reader.fieldnames.index(item[0])))
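A variant of the same idea, offered as an alternative sketch rather than this answer's own code: building the OrderedDict directly from reader.fieldnames avoids the repeated list.index lookups in the sort key. The helper name ordered_row and the sample data are made up for illustration.

```python
import csv
import io
from collections import OrderedDict

def ordered_row(row, fieldnames):
    """Return an OrderedDict whose keys follow the order of fieldnames."""
    return OrderedDict((name, row.get(name)) for name in fieldnames)

# Hypothetical in-memory input standing in for file.csv.
sample = io.StringIO("ID,ID2,Date\n1,2,2020-01-01\n")
reader = csv.DictReader(sample)
rows = [ordered_row(row, reader.fieldnames) for row in reader]
```

Note that any surplus values DictReader collects under its restkey are not carried over by this sketch, since only the named columns are copied.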
Upvotes: 10
Reputation: 125
I know this question is old... but if you use DictReader, you can pass an ordered list of field names to its fieldnames parameter.
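A short sketch of that approach, assuming a headerless input (the sample line is made up; the column names are the ones from the question). One thing to keep in mind: when fieldnames is supplied, DictReader does not consume a header row, so a file that does start with a header will return that header as its first data row.

```python
import csv
import io

# Hypothetical headerless input; the column names come from the question.
columns = ["ID", "ID2", "Date", "Job No", "Code"]
sample = io.StringIO("1,2,2020-01-01,77,X\n")

# With fieldnames given, every line of the input is returned as data.
reader = csv.DictReader(sample, fieldnames=columns)
rows = list(reader)
print(reader.fieldnames)
```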
Upvotes: 2
Reputation: 3177
from csv import DictReader, DictWriter

with open("input.csv", 'r') as input_file:
    reader = DictReader(f=input_file)
    with open("output.csv", 'w') as output_file:
        writer = DictWriter(f=output_file, fieldnames=reader.fieldnames)
        for row in reader:
            writer.writerow(row)
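As written, the snippet above copies the data rows but never writes a header line. A possible variant that also emits the header, sketched with in-memory files so it can be run as-is; the function name copy_csv and the sample data are assumptions for illustration, and the input is assumed to have a header row.

```python
import csv
import io

def copy_csv(src, dst):
    """Copy CSV rows from src to dst, keeping the header and column order."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()  # emit the header line in the original column order
    for row in reader:
        writer.writerow(row)

# Hypothetical sample input using the columns from the question.
source = io.StringIO("ID,ID2,Date,Job No,Code\n1,2,2020-01-01,77,X\n")
target = io.StringIO()
copy_csv(source, target)
print(target.getvalue())
```

With real files, the csv module documentation recommends opening both with newline='' so that line endings are handled by the csv reader and writer themselves.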
Upvotes: 9