Reputation: 685
Input File:
["abc","on time","date","<a href='link'>11111</a>","time","2","2"],
["abc","on time","date","<a href='link'>11111</a>","time","2","6"],
["abc","on time","date","<a href='link'>11111</a>","time","2","9"],
["abc","on time","date","<a href='link'>11111</a>","time","2","0"],
["abc","on time","date","<a href='link'>11111</a>","time","2","5"]
Desired output:
abc,on time,date,<a href='link'>11111</a>,time,2,2
abc,on time,date,<a href='link'>11111</a>,time,2,6
abc,on time,date,<a href='link'>11111</a>,time,2,9
abc,on time,date,<a href='link'>11111</a>,time,2,0
abc,on time,date,<a href='link'>11111</a>,time,2,5
Code tried:
import sys
import re
Lines = [Line.strip() for Line in open (sys.argv[1],'r').readlines()]
for EachLine in Lines:
    Parts = EachLine.split(",")
    for EachPart in Parts:
        EachPart = re.sub(r'[', '', EachPart)
        EachPart = re.sub(r']', '', EachPart)
    print ' '.join(Parts)
Can anyone help me with this? I am not getting the output I want. Thanks in advance.
Upvotes: 0
Views: 62
Reputation: 1632
Another option without using regex is:
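import sys
for line in open(sys.argv[1]):
    # Drop the trailing comma and the enclosing brackets, then the double quotes
    formatted = line.strip().rstrip(',').strip('[]').replace('"', '')
    print(formatted)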
Upvotes: 0
Reputation: 19805
As already mentioned, you can use eval().
with open('a.txt') as f:
    for line in f:
        line = line.replace(',\n', '\n').strip()  # drop a trailing `,` if present
        if line:  # skip empty lines
            print(','.join(eval(line.strip())))
Input file (a.txt):
["abc","on time","date","<a href='link'>11111</a>","time","2","2"],
["abc","on time","date","<a href='link'>11111</a>","time","2","6"],
["abc","on time","date","<a href='link'>11111</a>","time","2","9"],
["abc","on time","date","<a href='link'>11111</a>","time","2","0"],
["abc","on time","date","<a href='link'>11111</a>","time","2","5"]
Output:
abc,on time,date,<a href='link'>11111</a>,time,2,2
abc,on time,date,<a href='link'>11111</a>,time,2,6
abc,on time,date,<a href='link'>11111</a>,time,2,9
abc,on time,date,<a href='link'>11111</a>,time,2,0
abc,on time,date,<a href='link'>11111</a>,time,2,5
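Since eval() executes whatever expression a line contains, a more defensive variant could use ast.literal_eval, which only accepts Python literals. The following is a minimal sketch, assuming every non-empty line looks like the sample input; the helper name rows_to_csv_lines is hypothetical and takes the file contents as a string:

```python
import ast

def rows_to_csv_lines(text):
    """Turn lines like '["a","b"],' into 'a,b' (hypothetical helper)."""
    result = []
    for line in text.splitlines():
        line = line.strip().rstrip(',')  # drop whitespace and a trailing comma
        if not line:                     # skip empty lines
            continue
        fields = ast.literal_eval(line)  # parses literals only, runs no code
        result.append(','.join(fields))
    return result
```

It could be called as rows_to_csv_lines(open('a.txt').read()), printing each returned string on its own line.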
Upvotes: 0
Reputation: 901
I modified your initial solution to:
import sys
import re
Lines = [Line.strip() for Line in open(sys.argv[1], 'r').readlines()]
for EachLine in Lines:
    # Capture the text between each pair of double quotes
    matches = re.findall(r'\"(.+?)\"', EachLine)
    print(','.join(matches))
My approach is to use a regex to extract every string enclosed in double quotes.
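To show what the pattern does on a single sample line, here is a small sketch; the variable names are only for illustration:

```python
import re

line = '["abc","on time","date","<a href=\'link\'>11111</a>","time","2","6"],'
# Non-greedy match between each pair of double quotes; the single quotes
# inside the HTML link are left untouched.
matches = re.findall(r'"(.+?)"', line)
joined = ','.join(matches)
print(joined)
```

Note that .+? needs at least one character, so an empty field ("") would throw the quote pairing off. If empty fields can occur, .*? is probably the safer choice.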
Upvotes: 1