Reputation: 3
I'm trying to find the most efficient way to escape double quotes (") in a JSON feed that contains quotes in incorrect places.
For example:
{ "count": "1", "query": "www.mydomain.com/watchlive/type/livedvr/event/69167/"%20%20sTyLe=X:eX/**/pReSsIoN(window.location=56237)%20"", "error": "500"}
There are three keys above: count, query and error. The value of "query" is invalid because the extra double quotes make the whole document invalid JSON.
If I escape them using \" then the JSON is valid and can be parsed by the PHP engine, but since the feed can have over 5000 sets of data, I can't just go and fix the offending line(s) manually.
I know that a combination of preg_match and str_replace will work, but it's very messy and not maintainable code. I need a regex to use in something like this:
$buffer = '{ "count": "1", "query": "www.mydomain.com/watchlive/type/livedvr/event/69167/"%20%20sTyLe=X:eX/**/pReSsIoN(window.location=56237)%20"", "error": "500"}';
preg_match('/(query": ")(.*)(", "error)/', $buffer , $match);
Thanks in advance
Upvotes: 0
Views: 136
Reputation: 20486
Match using the expression on the first line below, and replace each match with the string on the second line:
(?:"query"\s*:\s*"|(?<!\A)\G)[^"]*\K"(?=.*?",)
\"
In PHP, this would use preg_replace():
$buffer = preg_replace('/(?:"query"\s*:\s*"|(?<!\A)\G)[^"]*\K"(?=.*?",)/', '\"', $buffer);
var_dump($buffer);
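To check that the replacement actually yields parseable JSON, the result can be passed through json_decode(). This is a minimal sketch using the sample feed from the question; the variable names $pattern, $fixed and $data are illustrative, and it assumes the stray quotes are not already escaped.

```php
<?php
// Sample feed from the question, with unescaped quotes inside "query".
$buffer = '{ "count": "1", "query": "www.mydomain.com/watchlive/type/livedvr/event/69167/"%20%20sTyLe=X:eX/**/pReSsIoN(window.location=56237)%20"", "error": "500"}';

// Same pattern as above, kept in a variable for readability.
$pattern = '/(?:"query"\s*:\s*"|(?<!\A)\G)[^"]*\K"(?=.*?",)/';

// Replace each stray quote with a backslash-escaped quote.
$fixed = preg_replace($pattern, '\"', $buffer);

// json_decode() returns null on failure; json_last_error_msg() says why.
$data = json_decode($fixed, true);
if ($data === null) {
    echo 'Still invalid: ' . json_last_error_msg() . PHP_EOL;
} else {
    // The decoded value contains the original quotes again, unescaped.
    echo $data['query'] . PHP_EOL;
}
```

Decoding the untouched $buffer the same way should fail, which makes this a quick before/after check when running the replacement over a larger feed.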
Explanation:
(?:                  # Start non-capturing group
  "query"\s*:\s*"    # Match "query":" literally, with optional whitespace
 |                   # OR
  (?<!\A)            # Make sure we are not at the beginning of the string
  \G                 # Start at the end of the last match
)                    # End non-capturing group
[^"]*                # Go through non-" characters
\K                   # Remove everything to the left from the match
"                    # Match " (this will be the only thing matched and replaced)
(?=                  # Start lookahead group
  .*?",              # Lazily match up until the ", (this is the end of the JSON value)
)                    # End lookahead group
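To see the pieces described above working together, the pattern can be run through preg_match_all() with PREG_OFFSET_CAPTURE, which lists every quote the expression selects. This is only an illustrative sketch on the sample feed from the question; the names $pattern, $count and $context are made up for the example.

```php
<?php
// Sample feed from the question.
$buffer = '{ "count": "1", "query": "www.mydomain.com/watchlive/type/livedvr/event/69167/"%20%20sTyLe=X:eX/**/pReSsIoN(window.location=56237)%20"", "error": "500"}';

$pattern = '/(?:"query"\s*:\s*"|(?<!\A)\G)[^"]*\K"(?=.*?",)/';

// Each entry of $matches[0] is [matched text, byte offset in $buffer].
$count = preg_match_all($pattern, $buffer, $matches, PREG_OFFSET_CAPTURE);
echo "Matches: $count" . PHP_EOL;

foreach ($matches[0] as $match) {
    $offset = $match[1];
    // Show up to 15 characters of context ending at the matched quote.
    $start = max(0, $offset - 15);
    $context = substr($buffer, $start, $offset - $start + 1);
    echo "offset $offset: $context" . PHP_EOL;
}
```

Because of \K, each reported match is just the single quote character, and the closing quote of the value is skipped because no ", follows it in this sample. The lookahead only checks that some ", appears later on the same line, so a feed with further string values after "query" on that line may need a tighter lookahead.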
Upvotes: 2