Reputation: 1
I'm trying to read some data and parse it out as CSV. The data format in question comes with a wacky first line that I first need to get rid of.
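import csv
import io

delimiter = None

# input1: the whole file, including the 'sep=' line
with open('data.csv', 'r', encoding='latin1') as fd:
    input1 = io.StringIO(fd.read())

# input2: write every line except the 'sep=' line into an empty StringIO
with open('data.csv', 'r', encoding='latin1') as fd:
    input2 = io.StringIO()
    for line in fd:
        if line.startswith('sep='):
            delimiter = line[4]
        else:
            input2.write(line)

# input3: collect the same lines in a str, then build the StringIO from it
with open('data.csv', 'r', encoding='latin1') as fd:
    buf = ''
    for line in fd:
        if line.startswith('sep='):
            delimiter = line[4]
        else:
            buf += line
    input3 = io.StringIO(buf)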
If I do write that first line into the buffers as well, then input1.getvalue() == input2.getvalue() == input3.getvalue(). If I don't, then at least input2.getvalue() == input3.getvalue().
Then comes the CSV bit:
inputReader = csv.DictReader(inputX, delimiter=delimiter or ';')
for row in inputReader:
    print(row)
This works for input1, but due to the wacky first line it messes up the column names, as expected.
It works for input3, with correct column names. I'm curious though as to why the for loop doesn't return any results for input2. What's the difference between input2 and input3 at that point?
Upvotes: 0
Views: 304
Reputation: 281551
After the write() calls, input2 is positioned at the end of the "file", so the reader finds nothing left to read. Constructing a StringIO from a string directly, as you do for input3, places the file position at the start.
To fix the input2 code, seek back to the start once you're done writing:
input2.seek(0)
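If you want to see the difference for yourself, here is a minimal sketch. The sample string raw is made up and only stands in for data.csv, and the variable names are hypothetical; the rest mirrors the input2 approach from the question. It parses the written-to buffer once before and once after rewinding, and checks the position of a StringIO built from a string.

```python
import csv
import io

# Made-up sample data standing in for data.csv (an assumption, not the real file)
raw = "sep=;\nname;age\nalice;30\nbob;25\n"

delimiter = None
written = io.StringIO()
for line in io.StringIO(raw):
    if line.startswith("sep="):
        delimiter = line[4]
    else:
        written.write(line)

# After the writes, the position sits at the end of the buffer,
# so the reader has nothing left to consume.
end_position = written.tell()
rows_before_seek = list(csv.DictReader(written, delimiter=delimiter or ";"))

# Rewind to the start and parse again.
written.seek(0)
rows_after_seek = list(csv.DictReader(written, delimiter=delimiter or ";"))

# A StringIO constructed from a string starts at position 0.
from_string = io.StringIO(written.getvalue())
start_position = from_string.tell()

print(end_position, rows_before_seek)
print(rows_after_seek)
print(start_position)
```

The first parse should come back empty and the second should contain both data rows, which matches what you observed with input2 versus input3.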
Upvotes: 1