Reputation: 657
Hi, I have a line in which I want to remove the tab inside the double quotes. I have written a script for that, but it is not working as I want. My line:
Q3U962 Mus musculus MRMP-mouse Optimization "MRMP-mouse "
My script:
for repline in reppepdata:
    findtorep = re.findall(r"['\"](.*?)['\"]", repline)
    if len(findtorep) > 0:
        for repitem in findtorep:
            repchar = repitem
            repchar = repchar.replace('\t', '')
My output should be:
Q3U962 Mus musculus MRMP-mouse Optimization "MRMP-mouse"
But I am getting this:
Q3U962 Mus musculus MRMP-mouseOptimization "MRMP-mouse"
The words are separated by tabs. With the tabs written as \t, the line is:
Q3U962\tMus musculus\tMRMP-mouse\tOptimization \t"MRMP-mouse\t"
Does anyone have an idea how to do this?
Upvotes: 1
Views: 474
Reputation: 626748
NOTE: This answer assumes (as the OP confirmed) that there are no escaped quotes or escape sequences in the input.
You may match the quoted string with a simple "[^"]+" regex that matches a ", 1+ chars other than " and a ", and replace the tabs inside within a lambda:
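import re

# Sample line from the question, with the tabs written as \t escapes
s = 'Q3U962\tMus musculus\tMRMP-mouse\tOptimization \t"MRMP-mouse\t"'
# Remove tabs only inside each double-quoted substring; the rest is untouched
res = re.sub(r'"[^"]+"', lambda m: m.group(0).replace("\t", ""), s)
print(res)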
See the Python demo
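To apply the same substitution to every line, as the loop in the question tries to do, something along these lines might work. It is only a sketch: strip_tabs_in_quotes is a hypothetical helper name, and the reppepdata list here is a stand-in for the real data. Note that str.replace returns a new string, so the result has to be stored somewhere; in the loop shown in the question, repchar is never written back into the line.

```python
import re

# Precompiled pattern: a double quote, 1+ non-quote characters, a double quote.
# Assumes there are no escaped quotes in the input.
QUOTED = re.compile(r'"[^"]+"')


def strip_tabs_in_quotes(line):
    """Hypothetical helper: remove tabs only inside double-quoted substrings."""
    return QUOTED.sub(lambda m: m.group(0).replace("\t", ""), line)


# Stand-in for the question's reppepdata (assumed to be an iterable of lines).
reppepdata = [
    'Q3U962\tMus musculus\tMRMP-mouse\tOptimization \t"MRMP-mouse\t"',
    'no quotes\there',
]

# Build a new list, since strings are immutable and must be reassigned.
cleaned = [strip_tabs_in_quotes(line) for line in reppepdata]

for line in cleaned:
    print(repr(line))  # repr makes the remaining tabs visible as \t
```

Tabs outside the quotes are left alone, so the line can still be split on the tab delimiter afterwards.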
Upvotes: 1