Reputation: 298
I am trying to use Python's csv standard library module to generate comma-separated value (CSV) files.
It will not allow the first line to be blank. More annoyingly, it treats the first line differently from later lines: a row containing a single empty string is written as a quoted empty string ("") on the first line but as a blank line thereafter:
import csv
import io

def make_csv(rows):
    with io.StringIO(newline='') as sout:
        writer = csv.writer(sout, quoting=csv.QUOTE_MINIMAL)
        writer.writerows(rows)
        return sout.getvalue()
Given the above definition, an interpreter session might look like:
>>> make_csv([[''], ['']]) # (only the) first line has quoted empty string
'""\r\n\r\n'
>>> make_csv([['A'], ['A']]) # expected: same input row, same output row
'A\r\nA\r\n'
Why does this quoted empty string happen only on the first line? Is there any way I can stop it, or at least get more consistent behavior?
Update: this was a bug, reported in December 2017 as https://bugs.python.org/issue32255 and resolved by commit https://github.com/python/cpython/commit/2001900b0c02a397d8cf1d776a7cc7fcb2a463e3, which was included in the 3.6.5 release.
Upvotes: 1
Views: 1027
Reputation: 43136
You can force the csv writer to quote the empty strings by setting a different quoting strategy. Both QUOTE_ALL and QUOTE_NONNUMERIC will do what you want:
def make_csv(rows):
    with io.StringIO(newline='') as sout:
        writer = csv.writer(sout, quoting=csv.QUOTE_NONNUMERIC)
        writer.writerows(rows)
        return sout.getvalue()
>>> make_csv([[''], ['']])
'""\r\n""\r\n'
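The two strategies agree on empty strings but differ on numbers, which may matter when choosing between them. Here is a minimal sketch of the difference; the helper name make_csv_quoted and its quoting parameter are hypothetical, introduced only for this illustration, while the csv.QUOTE_* constants come from the standard library:

```python
import csv
import io


def make_csv_quoted(rows, quoting):
    """Serialize rows to a CSV string using the given csv.QUOTE_* constant."""
    with io.StringIO(newline='') as sout:
        writer = csv.writer(sout, quoting=quoting)
        writer.writerows(rows)
        return sout.getvalue()


rows = [[''], ['A', 1]]

# QUOTE_ALL quotes every field, including numbers.
print(repr(make_csv_quoted(rows, csv.QUOTE_ALL)))         # '""\r\n"A","1"\r\n'

# QUOTE_NONNUMERIC quotes strings but leaves numbers bare.
print(repr(make_csv_quoted(rows, csv.QUOTE_NONNUMERIC)))  # '""\r\n"A",1\r\n'
```

If every field is a string, either constant should give the same output.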
I don't know why the default strategy treats the first line differently from the others, but I believe it's a bug. If you load CSV data in which the second line isn't quoted, the rows you read back differ from the rows you originally wrote:
>>> data = [[''], ['']]
>>> text = make_csv(data)
>>> text
'""\r\n\r\n'
>>> f = io.StringIO(text)
>>> reader = csv.reader(f)
>>> list(reader)
[[''], []]
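A round-trip check like the one above can be wrapped in a small helper to test whether a given quoting strategy preserves the rows. This is only a sketch: the name round_trips is hypothetical, and it assumes every field is a string, since csv.reader returns strings by default.

```python
import csv
import io


def round_trips(rows, quoting=csv.QUOTE_ALL):
    """Return True if writing rows and reading them back yields equal rows."""
    # Write the rows to an in-memory CSV string.
    with io.StringIO(newline='') as sout:
        csv.writer(sout, quoting=quoting).writerows(rows)
        text = sout.getvalue()
    # Parse that string again and compare with the original rows.
    with io.StringIO(text, newline='') as sin:
        return list(csv.reader(sin)) == rows


print(round_trips([[''], ['']]))  # True with QUOTE_ALL
```

Passing quoting=csv.QUOTE_MINIMAL instead lets you check whether your Python version is affected by the behavior described in the question.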
Upvotes: 2