Reputation: 234
Python: 3.x
Hi. I have the CSV file below, which has a header and rows. The row count may vary from file to file. I am trying to convert this CSV into a list of dicts, but the data is being repeated for the first row.
"cdrRecordType","globalCallID_callManagerId","globalCallID_callId"
1,3,9294899
1,3,9294933
Code:
parserd_list = []
output_dict = {}
with open("files\\CUCMdummy.csv") as myfile:
    firstline = True
    for line in myfile:
        if firstline:
            mykeys = ''.join(line.split()).split(',')
            firstline = False
        else:
            values = ''.join(line.split()).split(',')
            for n in range(len(mykeys)):
                output_dict[mykeys[n].rstrip('"').lstrip('"')] = values[n].rstrip('"').lstrip('"')
                print(output_dict)
                parserd_list.append(output_dict)
#print(parserd_list)
(My CSV files generally have more than 20 columns; this is just a sample.)
(I have used rstrip/lstrip to get rid of the double quotes.)
Output I am getting:
{'cdrRecordType': '1'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294899'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294899'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294899'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294933'}
This is the output of the print inside the for loop, and the final output is the same.
I don't know what mistake I am making. Could someone please help me correct it?
Thanks in advance.
Upvotes: 4
Views: 3286
Reputation: 6877
Instead of manually parsing a CSV file, you should use the csv module.
This will result in a simpler script and makes it easier to handle edge cases gracefully (e.g. the header row, inconsistently quoted fields, etc.).
import csv
with open('example.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        print(row)
Output:
$ python3 parse-csv.py
OrderedDict([('cdrRecordType', '1'), ('globalCallID_callManagerId', '3'), ('globalCallID_callId', '9294899')])
OrderedDict([('cdrRecordType', '1'), ('globalCallID_callManagerId', '3'), ('globalCallID_callId', '9294933')])
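To illustrate the edge cases mentioned above, here is a minimal sketch that runs a small in-memory sample through csv.DictReader and collects the rows into a list. The sample data is an assumption made for illustration: the "comment" column, with its embedded comma, is invented and is not part of the file in the question. Also note that on Python 3.8 and later each row comes back as a plain dict rather than an OrderedDict.

```python
import csv
import io

# Hypothetical sample: the "comment" column is invented to show a quoted field
# that contains a comma, which a plain split(',') would break apart.
sample = io.StringIO(
    '"cdrRecordType","globalCallID_callId","comment"\n'
    '1,9294899,"transfer, then hold"\n'
    '1,9294933,"normal"\n'
)

# DictReader consumes the header row itself and yields one mapping per data row.
rows = [dict(row) for row in csv.DictReader(sample)]

for row in rows:
    print(row)
```

With a real file you would pass the open file object to csv.DictReader instead of the io.StringIO stand-in.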
If you're intent on parsing manually, here's an approach for doing so:
parsed_list = []
with open('example.csv') as myfile:
    firstline = True
    for line in myfile:
        # Strip leading/trailing whitespace and split into a list of values.
        values = line.strip().split(',')
        # Remove surrounding double quotes from each value, if they exist.
        values = [v.strip('"') for v in values]
        # Use the first line as keys.
        if firstline:
            keys = values
            firstline = False
            # Skip to the next iteration of the for loop.
            continue
        parsed_list.append(dict(zip(keys, values)))
for p in parsed_list:
    print(p)
Output:
$ python3 manual-parse-csv.py
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294899'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294933'}
Upvotes: 7
Reputation: 1350
The indentation of your code is wrong.
These two lines:
print(output_dict)
parserd_list.append(output_dict)
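should be un-indented to the same indentation level as the inner for loop above them, so that they run once per line rather than once per key. On top of this, you need to create a new dict for each line of the file; otherwise every entry in the list refers to the same dict object.
You can do this by adding:
output_dict = {}
right before the for loop over the keys.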
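Putting both changes together, a minimal sketch of the corrected loop might look like the following. It reads from an in-memory io.StringIO stand-in rather than the real file (an assumption, so that the example is self-contained), and it spells the list parsed_list.

```python
import io

# In-memory stand-in for the CSV file from the question (assumption for the demo).
myfile = io.StringIO(
    '"cdrRecordType","globalCallID_callManagerId","globalCallID_callId"\n'
    '1,3,9294899\n'
    '1,3,9294933\n'
)

parsed_list = []
firstline = True
for line in myfile:
    # Split the line on commas and drop any surrounding double quotes.
    fields = [field.strip('"') for field in line.strip().split(',')]
    if firstline:
        mykeys = fields
        firstline = False
    else:
        output_dict = {}  # fresh dict for every data line
        for n in range(len(mykeys)):
            output_dict[mykeys[n]] = fields[n]
        # Append once per line, after all keys have been filled in.
        parsed_list.append(output_dict)

print(parsed_list)
```

Because a new dict is created for each line, the entries in parsed_list are independent objects, so a later row no longer overwrites an earlier one.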
As mentioned above, there are libraries that will make life easier. But if you want to stick to appending dictionaries, you can also read all the lines of the file, close it, and then process the lines like this:
with open("scratch.txt") as myfile:
    data = myfile.readlines()
# The first line holds the column names; drop the double quotes around them.
keys = data[0].replace('"', '').strip().split(',')
output_dicts = []
for line in data[1:]:
    values = line.strip().split(',')
    output_dicts.append(dict(zip(keys, values)))
print(output_dicts)
[{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294899'}, {'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294933'}]
Upvotes: 0
Reputation: 1174
Use csv.DictReader:
import csv
# The csv docs recommend opening the file with newline=''.
with open("files\\CUCMdummy.csv", mode='r', newline='') as myFile:
    reader = list(csv.DictReader(myFile, delimiter=',', quotechar='"'))
Upvotes: 3