Reputation: 31
In a file that I would like to convert to JSON, I have this:
""status"":200,""ts"":1543039907796,""userAgent"":"Mozilla 5.0""..... <- (It's variable)
I would like to replace "" with ", that is:
"status":200,"ts":1543039907796,"userAgent":"Mozilla 5.0"......
I'm reading a log file in JSON format like this:
def process_log_file(cur, filepath):
    # open log file
    with open(filepath) as json_file:
        data = json_file.read().replace('\n', ',').replace('\\"', '').replace('\\/"', '').replace('\/"', ' ').replace('\/', ' ')
    df = pd.read_json(data)
In this line: data = json_file.read().replace('\n', ',').replace('\"', '').replace('\/"', '').replace('/"', ' ').replace('/', ' ')
I've tried with:
replace('""', '"')
replace("""""", "")
replace('(")(")','"')
and it does not work. Does anybody know why?
Upvotes: 0
Views: 682
Reputation: 2691
The text you want to replace starts with double quotation marks, which Python also uses as string delimiters. Try the following:
text="""
""status"":200,""ts"":1543039907796,""userAgent"":"Mozilla 5.0""
"""
text.replace('""','"')
This yields:
'\n"status":200,"ts":1543039907796,"userAgent":"Mozilla 5.0"\n'
You can then remove the '\n' characters however you like, for example with text.strip().
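To check that the replaced text really is usable as JSON, here is a minimal sketch. It assumes the fragment is the body of a single JSON object, so the surrounding braces are added purely for illustration; the names raw, fixed and record are hypothetical.

```python
import json

# Sample fragment from the question (the trailing dots are left out).
raw = '""status"":200,""ts"":1543039907796,""userAgent"":"Mozilla 5.0""'

# Collapse each doubled quote into a single one.
fixed = raw.replace('""', '"')

# Assumption: the fragment is the inside of one JSON object, so wrap it in braces.
record = json.loads('{' + fixed + '}')
print(record["status"], record["userAgent"])
```

If the real file holds one such record per line, the same replace could be applied line by line before parsing.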
Upvotes: 0
Reputation: 693
A regex lets you be more flexible than exact string matching. Your regex can look like:
re.sub('"+', '"', '""status"":200,""ts"":1543039907796,""userAgent"":"Mozilla 5.0""')
It replaces every run of consecutive double quotes with a single double quote, outputting this:
'"status":200,"ts":1543039907796,"userAgent":"Mozilla 5.0"'
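The same idea can be wrapped in a small helper, sketched below. The names DOUBLED_QUOTES and normalize_quotes are hypothetical, and the pattern is narrowed to runs of two or more quotes, which gives the same result as '"+' here. Note that any run of quotes is collapsed, so a value that is an empty string would be damaged; this sketch assumes the data contains none.

```python
import re

# Matches any run of two or more consecutive double quotes.
DOUBLED_QUOTES = re.compile(r'"{2,}')


def normalize_quotes(line):
    """Collapse each run of repeated double quotes into a single one."""
    return DOUBLED_QUOTES.sub('"', line)


sample = '""status"":200,""ts"":1543039907796,""userAgent"":"Mozilla 5.0""'
print(normalize_quotes(sample))
```

Compiling the pattern once is only a convenience when the helper is called for every line of a large log file.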
Upvotes: 4