steven

Reputation: 236

Using csv.writer to write a string to StringIO, why does it add extra characters?

How come when I write this string to StringIO the formatting changes?

header = '\r\n'.join(
    [unicode(line,'utf8') for line in 
     ['"Text"',
    '"More Text"',
     '',]])
print header

Output:

"Text"
"More Text"

And now adding it to my StringIO:

import csv
import StringIO

si = StringIO.StringIO()

writer = csv.writer(si)
writer.writerow(header)

si.getvalue()

Output:

'"""",T,e,x,t,"""","\r","\n","""",M,o,r,e, ,T,e,x,t,"""","\r","\n"\r\n'

Why is it adding commas and extra " characters?

Upvotes: 3

Views: 1956

Answers (1)

myaut

Reputation: 11504

That is because writer.writerow expects an iterable of fields, and strings are iterables too.

I.e. this code:

l = [1,2,3]
for i in l:
    print i

will print:

1
2
3

Same principle applies to strings:

s = 'abc'
for c in s:
    print c

will print:

a
b
c

Finally,

writer.writerow([1,2,3])   # Gives you 1,2,3
writer.writerow('abc')     # Gives you a,b,c

And since header is a string in your example, every character in it is treated as a separate field, which is where the extra commas come from. Wrapping the string in a list, however, produces a proper single-field row:

writer.writerow(['abc'])   # Gives you abc
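The difference between the two calls can be checked directly. This is a minimal sketch that assumes Python 3 (io.StringIO instead of the StringIO module used in the question); write_row is a hypothetical helper introduced only for this illustration:

```python
import csv
import io


def write_row(row):
    """Write one row to an in-memory buffer and return the raw CSV text.

    Hypothetical helper for this illustration, not part of the csv module.
    """
    buf = io.StringIO(newline="")
    csv.writer(buf).writerow(row)
    return buf.getvalue()


# A bare string is iterated character by character: one field per character.
as_string = write_row("abc")

# A one-element list is a row with a single field.
as_list = write_row(["abc"])

print(repr(as_string))  # 'a,b,c\r\n'
print(repr(as_list))    # 'abc\r\n'
```

The trailing \r\n is the default line terminator of the csv writer, which is also why the output in the question ends with \r\n.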

Also, many csv dialects quote a field when it contains the delimiter. Here the first comma is not treated as a delimiter because it sits inside quotes:

writer.writerow(['a,b',3])    # Gives you "a,b",3

When the quote character itself appears in a field, it also has to be escaped so that it does not confuse the parser. If the Dialect.doublequote flag is enabled (the default), the csv writer simply doubles it, which is why each lone " in your output became """":

writer.writerow(['a",b',3])    # "a"",b",3
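Putting this together for the header in the question: one possible approach, assuming the goal is a file whose lines are literally "Text" and "More Text", is to pass the bare text and let the writer add the quotes via csv.QUOTE_ALL. The sketch below assumes Python 3, and to_csv is a hypothetical helper used only here:

```python
import csv
import io


def to_csv(rows, **writer_kwargs):
    """Serialize rows to CSV text in memory (hypothetical helper)."""
    buf = io.StringIO(newline="")
    csv.writer(buf, **writer_kwargs).writerows(rows)
    return buf.getvalue()


# Default dialect: the embedded quote is doubled and the field is quoted.
escaped = to_csv([['a",b', 3]])
print(repr(escaped))  # '"a"",b",3\r\n'

# Reading it back restores the original field (numbers come back as strings).
round_trip = list(csv.reader(io.StringIO(escaped, newline="")))
print(round_trip)  # [['a",b', '3']]

# Header from the question: one single-field row per line, without
# quote characters in the data; QUOTE_ALL makes the writer add them.
header = to_csv([["Text"], ["More Text"]], quoting=csv.QUOTE_ALL)
print(repr(header))  # '"Text"\r\n"More Text"\r\n'
```

Keeping the quote characters out of the data and leaving quoting to the writer avoids the doubled quotes seen in the question.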

Upvotes: 2
