Reputation: 792
I am using the csv module in the following manner:
import csv
header = '"Id","IsDeleted","MasterRecordId","Salutation","FirstName","LastName","Name","Type","RecordTypeId","ParentId","BillingStreet","BillingCity","BillingState","BillingPostalCode","BillingCountry","BillingLatitude"'
header_c = csv.reader(header, delimiter=',', quotechar='"')
names = []
for row in header_c:
    names.append(row)
Inspecting names returns:
[['Id'], ['', ''], ['IsDeleted'], ['', ''], ['MasterRecordId'], ['', ''], ['Salutation'], ['', ''], ['FirstName'], ['', ''], ['LastName'], ['', ''], ['Name'], ['', ''], ['Type'], ['', ''], ['RecordTypeId'], ['', ''], ['ParentId'], ['', ''], ['BillingStreet'], ['', ''], ['BillingCity'], ['', ''], ['BillingState'], ['', ''], ['BillingPostalCode'], ['', ''], ['BillingCountry'], ['', ''], ['BillingLatitude']]
I could ignore all the odd entries, keeping 0, 2, 4, ..., but I don't understand what I am doing wrong or why the commas are being kept as entries. What do I have to change so that the commas are dropped? 'IsDeleted' should be the second entry (names[1]).
Thanks in advance.
Upvotes: 2
Views: 1213
Reputation: 1122342
csv.reader() can handle any iterable, and expects each iteration over that iterable to yield a complete line. The iterable can be a file-like object, or (normally) a list of strings:
header_c = csv.reader([header], delimiter=',', quotechar='"')
If you pass in just a single string object, the string itself is iterated over as if each character was a line, but because of the quotes csv will continue to read 'lines' until it finds a closing quote character. The next 'line' contains just a comma, so that is seen as a line of two empty values.
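To see this for yourself, here is a small sketch (Python 3, using a shortened two-column header that I made up for brevity) comparing a bare string with the same string wrapped in a list:

```python
import csv

# Shortened, hypothetical header; the full one behaves the same way.
header = '"Id","IsDeleted"'

# A bare string is iterated character by character, so each character is a "line".
broken = list(csv.reader(header, delimiter=',', quotechar='"'))
print(broken)  # [['Id'], ['', ''], ['IsDeleted']]

# Wrapped in a list, the reader sees exactly one complete line.
fixed = list(csv.reader([header], delimiter=',', quotechar='"'))
print(fixed)  # [['Id', 'IsDeleted']]
```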
Or, to take the first 5 characters ("Id",) as an example, csv does this:
- ". This is a quoted value, so include everything up to the end of the line.
- I, append.
- d, append.
- ". Quote closed, yield a complete row ['Id'].
- ,. This is a complete line with a delimiter, so yield ['', ''].
Whenever I need to pass in a string value to csv.reader() I use str.splitlines(); this method will always return a list, so this works for lines without newlines too:
header_c = csv.reader(header.splitlines(True), delimiter=',', quotechar='"')
I leave in the newlines (by passing True to str.splitlines()); quoted values that contain newlines are then properly returned with the newlines included.
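As a rough illustration of why keeping the line endings matters, here is a sketch with a made-up two-record string (not from the question) where one quoted field spans two lines:

```python
import csv

# Hypothetical input: the second field of the first record contains
# a newline inside the quotes.
data = '"Id","Billing\nStreet"\n"1","2"'

# keepends=True leaves the '\n' on each piece, so the reader can put
# the embedded newline back into the quoted field.
kept = list(csv.reader(data.splitlines(True), delimiter=',', quotechar='"'))
print(kept)  # [['Id', 'Billing\nStreet'], ['1', '2']]

# Without the line endings the reader still joins the pieces,
# but it has no newline character to restore.
dropped = list(csv.reader(data.splitlines(), delimiter=',', quotechar='"'))
print(dropped)
```

In both cases the records are split correctly; only the content of the multi-line field differs.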
Upvotes: 4
Reputation: 473893
You should pass a file-like object (or any other iterable that yields one line per iteration) to csv.reader as the first parameter.
csv.reader(csvfile, dialect='excel', **fmtparams)
Return a reader object which will iterate over lines in the given csvfile. csvfile can be any object which supports the iterator protocol and returns a string each time its next() method is called — file objects and list objects are both suitable.
One option is to read the string into a StringIO buffer:
from io import StringIO  # Python 3; on Python 2 use: from StringIO import StringIO
header_c = csv.reader(StringIO(header), delimiter=',', quotechar='"')
Then, in names, you'll get:
[['Id', 'IsDeleted', 'MasterRecordId', 'Salutation', 'FirstName', 'LastName', 'Name', 'Type', 'RecordTypeId', 'ParentId', 'BillingStreet', 'BillingCity', 'BillingState', 'BillingPostalCode', 'BillingCountry', 'BillingLatitude']]
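Note that this is a list containing one row. If you want 'IsDeleted' at names[1], as asked, one way (a sketch assuming Python 3 and a shortened header for brevity) is to pull the single row out with next():

```python
import csv
from io import StringIO  # Python 3 location of StringIO

# Shortened, hypothetical header; the full one works the same way.
header = '"Id","IsDeleted","MasterRecordId"'

reader = csv.reader(StringIO(header), delimiter=',', quotechar='"')

# The reader yields one list per line; take the first (and only) row.
names = next(reader)

print(names)     # ['Id', 'IsDeleted', 'MasterRecordId']
print(names[1])  # IsDeleted
```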
Upvotes: 3