Reputation: 309
I need to read a pipe (|)-separated text file. One of the fields contains a description that may contain double quotes. I noticed that every line containing a " is missing from the resulting dict. To avoid this, I tried to read the entire line and use string.replace() to remove the quotes, as shown below, but it looks like the quotes cause a problem at the line-reading stage, i.e. before string.replace() is called.
The code is below, and the question is: how do I force Python not to use any separator and keep the line whole?
import csv

with open(fileIn) as txtextract:
    # "µ" never occurs in the file, so each row should hold the whole line
    readlines = csv.reader(txtextract, delimiter="µ")
    for line in readlines:
        (...)
        LI_text = newline[107:155]
        LI_text.replace("|", "/")
        LI_text.replace("\"", "")  # escaping the quote doesn't work either
Note: I am using Python 3.6.
Upvotes: 2
Views: 2470
Reputation: 309
Lesson: str.replace() does not modify the string in place; it returns a new string. The result must be stored back (string = string.replace(...)).
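A minimal sketch of that lesson, using a made-up sample value rather than data from the real file: Python strings are immutable, so the result of replace() has to be assigned to a name to have any effect.

```python
# Hypothetical sample field; not taken from the original file.
LI_text = 'Bolt 3/4" | zinc "plated"'

# Calling replace() and discarding the result leaves LI_text untouched.
LI_text.replace('"', "")
unchanged = LI_text

# Store the result back; the two calls can be chained.
LI_text = LI_text.replace("|", "/").replace('"', "")
print(LI_text)
```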
Upvotes: 0
Reputation: 111
You can use a regex:
In [1]: import re
In [2]: re.sub(r"\"", "", '"remove all "double quotes" from text"')
Out[2]: 'remove all double quotes from text'
In [3]: re.sub(r"(^\"|\"$)", "", '"remove all "only surrounding quotes" from text"')
Out[3]: 'remove all "only surrounding quotes" from text'
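or pass quoting=csv.QUOTE_NONE to csv.reader(), which makes the reader treat double quotes as ordinary characters (the defaults, quotechar='"' and quoting=csv.QUOTE_MINIMAL, give them special meaning), like:
import csv

with open(fileIn) as txtextract:
    readlines = csv.reader(txtextract, delimiter="µ", quoting=csv.QUOTE_NONE)
    for line in readlines:
        (...)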
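As a self-contained sketch of the csv.QUOTE_NONE approach, the snippet below parses an in-memory sample with the real "|" delimiter. The sample rows and the helper name read_rows are assumptions made up for illustration; the actual file layout may differ.

```python
import csv
import io

# Hypothetical pipe-separated sample; the second row has quotes in a field.
sample = 'A1|plain description|10\nA2|3/4" bolt, "zinc" plated|20\n'

def read_rows(text):
    """Split pipe-separated text into rows, leaving double quotes untouched."""
    reader = csv.reader(io.StringIO(text), delimiter="|", quoting=csv.QUOTE_NONE)
    return list(reader)

rows = read_rows(sample)

# Strip the quotes afterwards, keeping the result of replace().
cleaned = [[field.replace('"', "") for field in row] for row in rows]

for row in cleaned:
    print(row)
```

With quote handling switched off, rows containing a " come through intact, and the quotes can be removed per field once parsing is done.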
Upvotes: 2