Reputation: 49
I have a tsv file with content like:
'"1"\t"2"\t"3"\t"4"\n'
'"5"\t"6\n7"\t"8"\t"9"\n'
I want to be able to ignore \n symbols inside double quotes, but the file's readline() method reads it as:
1 2 3 4
5 6
7 8 9
What I want is:
1 2 3 4
5 6\n7 8 9
I tried to pass it to the newline parameter:
f = open('file.tsv', newline = '"\n')
f = open('file.tsv', newline = '\"\n')
But I get
ValueError: illegal newline value: "
Upvotes: 0
Views: 611
Reputation: 8059
The readline() method doesn't parse TSV-like strings; it just reads the content as it is. Python provides the csv module to read such files:
import csv
# newline='' lets the csv module handle line endings inside quoted fields itself
with open('test.csv', newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter='\t', quotechar='"')
    for row in reader:
        print(', '.join(row))
Read more about this module in the docs.
BTW, your trick with the newline argument throws an error because it can only be None, '', '\n', '\r', or '\r\n'.
Note that newline only applies to text mode.
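To see both points in one place, here is a small sketch. It writes a throwaway temp file (the file and the variable names are just for illustration, not from the question), tries the illegal newline value, and then shows that even a legal value doesn't make readline() aware of quotes:

```python
import os
import tempfile

# Write a small throwaway file; newline='' keeps the '\n' bytes exactly as given.
with tempfile.NamedTemporaryFile('w', suffix='.tsv', delete=False, newline='') as tmp:
    tmp.write('"5"\t"6\n7"\t"8"\t"9"\n')
    path = tmp.name

try:
    # An arbitrary string is rejected as a newline value.
    try:
        open(path, newline='"\n')
        error = None
    except ValueError as exc:
        error = str(exc)
    print(error)

    # A legal value opens the file, but readline() still stops at the
    # newline inside the quoted field because it knows nothing about quoting.
    with open(path, newline='') as f:
        first = f.readline()
    print(repr(first))
finally:
    os.remove(path)
```

So the newline parameter only controls how line endings are recognized and translated; it can't be used to describe quoting rules.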
UPDATE:
See the example with the asker's data below:
import csv
from io import StringIO
file = StringIO("""'"1"\t"2"\t"3"\t"4"'
'"5"\t"6\n7"\t"8"\t"9"'""")
reader = csv.reader(file, delimiter='\t', quotechar='"')
for row in reader:
    print(row)
Output:
['\'"1"', '2', '3', "4'"]
['\'"5"', '6\n7', '8', "9'"]
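So it works as expected: the newline inside the quoted field stays within a single row. The splitting cannot be done at the moment you open the file, because csv.reader works with the file object as a whole, not with individual lines. (The stray ' characters in the output presumably come from the single quotes copied over from the question's listing; they would not be in the real file.)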
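For completeness, a minimal sketch that gets the exact output asked for in the question. It assumes the real file does not contain the surrounding single quotes (they look like Python repr quoting), and the names data and rows are only for illustration:

```python
import csv
from io import StringIO

# Same content as the question's file, without the repr quotes around each line.
data = '"1"\t"2"\t"3"\t"4"\n"5"\t"6\n7"\t"8"\t"9"\n'

# newline='' leaves line endings untranslated so csv.reader handles them itself.
rows = list(csv.reader(StringIO(data, newline=''), delimiter='\t', quotechar='"'))

for row in rows:
    # Show the embedded newline as a literal backslash-n so each record
    # prints on a single line.
    print(' '.join(field.replace('\n', '\\n') for field in row))
```

With a real file, replace the StringIO object with open('file.tsv', newline=''); the rest should stay the same.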
Upvotes: 3