Reputation: 14011
I am new to Python, and I am currently stumped by this problem:
I have a list of dictionaries generated by csv.DictReader. I created the list with the following function:
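import csv

def csvToDictList(filename):
    # 'rb' is the Python 2 convention; on Python 3 use open(filename, newline='')
    with open(filename, 'rb') as f:
        reader = csv.DictReader(f)
        rows = list(reader)  # read every row into a list of dicts
    return (rows, reader.fieldnames)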
This worked great, but the CSV file I am processing has duplicate columns, so I end up with a list of dictionaries like:
[
{'Column1': 'Value1', 'Column2': 'Value2', ... <some unique columns and values> ..., 'Column1': 'Value1', 'Column2': 'Value2'},
...
{'Column1': 'Value1N', 'Column2': 'Value2N', ... <some unique columns and values> ..., 'Column1': 'Value1N', 'Column2': 'Value2N'}
]
My main question is: how do I remove the duplicate columns from this list of dictionaries?
I thought about iterating over each key, and then removing the column when I detect a duplicate key name with something like this:
def removeColumn(dictList, colName):
    for row in dictList:
        del row[colName]
But, won't that remove both columns? Should I be operating on the hash-keys of the dictionary? Any help is appreciated!
EDIT: The duplicates I was seeing were actually present in the reader.fieldnames list. So, I was assuming the dictionaries contained these columns as well, which was an incorrect assumption.
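To illustrate the point in the edit, here is a minimal sketch that drops repeated names from reader.fieldnames while keeping their original order. It assumes Python 3 and an in-memory CSV; the helper name unique_fieldnames, the sample data, and the "Other" column are hypothetical and not part of the original code.

```python
import csv
import io


def unique_fieldnames(fieldnames):
    """Return the field names with later duplicates dropped, order preserved."""
    seen = set()
    unique = []
    for name in fieldnames:
        if name not in seen:
            seen.add(name)
            unique.append(name)
    return unique


# Hypothetical sample with two repeated column names.
sample = "Column1,Column2,Other,Column1,Column2\nValue1,Value2,x,Value1,Value2\n"

reader = csv.DictReader(io.StringIO(sample))
rows = list(reader)

# reader.fieldnames still lists every header, duplicates included;
# the row dicts themselves only ever hold one entry per name.
headers = unique_fieldnames(reader.fieldnames)
```

With this, headers can be used wherever a duplicate-free column list is needed, and the row dictionaries need no cleanup.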
Upvotes: 0
Views: 2734
Reputation: 212835
There is no such thing as duplicate keys in a dictionary.
If several columns share the same name, DictReader keeps only the last one (overwriting the previous ones).
For the following CSV file:
a,b,c,a,b
1,2,3,4,5
6,7,8,9,10
DictReader will return the following dicts:
{'a': '4', 'c': '3', 'b': '5'}
{'a': '9', 'c': '8', 'b': '10'}
thus throwing away the previous values for the a and b columns.
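The behaviour can be checked with a short script. This is a sketch assuming Python 3, with the sample file replaced by an in-memory string (the SAMPLE name is just for illustration). On recent Python versions the keys may print in a different order than shown above, but the contents are the same.

```python
import csv
import io

# The same sample data as above, held in memory instead of a file.
SAMPLE = "a,b,c,a,b\n1,2,3,4,5\n6,7,8,9,10\n"

reader = csv.DictReader(io.StringIO(SAMPLE))
rows = [dict(row) for row in reader]

# Each row has only three keys: the later 'a' and 'b' values win.
for row in rows:
    print(row)

# The header list, by contrast, still contains every column name.
print(reader.fieldnames)
```

So the duplicates survive only in reader.fieldnames, never in the row dictionaries.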
Upvotes: 2