Reputation: 2470
I have a page that contains a script element matching the XPath //script[@data-type="application/ld+json"].
The contents of this script are similar to the following:
<script>
{
"one": "some text here",
"two": "some "other" text here"
}
</script>
Is it possible to replace the inner double quotes with single quotes using a regex, so that I get:
"two": "some 'other' text here"
Or to just remove the inner quotes completely?
I can use the replace function.
The main problem is that I don't know how to match only the quotes inside a string.
Upvotes: 0
Views: 458
Reputation: 16
If the data always looks like this, you could try something like the regex below.
"(?=\w+"| )(?!\w+":)
I don't know your full use case; I wrote this based only on the sample you posted here.
You can test your regex in Sublime Text or at https://regexr.com/
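To show how the pattern above behaves, here is a minimal Python sketch that applies it with re.sub to the sample from the question. Python is only my choice for illustration (the question does not say which language its replace function belongs to), and the names INNER_QUOTE and sample are made up for this example.

```python
import json
import re

# The pattern from this answer: a double quote that is followed by either
# `word"` or a space, and that is not the opening quote of a `key":`.
INNER_QUOTE = re.compile(r'"(?=\w+"| )(?!\w+":)')

# The sample from the question (hypothetical variable name).
sample = '{\n"one": "some text here",\n"two": "some "other" text here"\n}'

# Replace every quote the pattern considers "inner" with a single quote.
fixed = INNER_QUOTE.sub("'", sample)

# The result now parses as JSON.
data = json.loads(fixed)
print(data["two"])
```

This only holds for input shaped like the sample. For example, a value made of a single word, such as "k": "word", has its opening quote replaced as well, so the pattern should be checked against the real data before relying on it.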
Upvotes: 0
Reputation: 163342
In general, it can't be done because your content is ambiguous. Consider:
{
"one": "some text here",
"two": "some ", "three": " text here"
}
You would have to adopt some rule like saying that the " after some is treated as a terminal quote if followed by , or } (optionally preceded by whitespace), or as the start quote of a nested string otherwise. That kind of logic seems far beyond what you can express in regular expressions. And in any case, it will sometimes give you the wrong answer.
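If you accept such a rule anyway, it can be sketched as a small character scanner rather than a regex. The Python below is only an illustration under assumptions: the function name fix_inner_quotes is hypothetical, and I have added : and ] to the set of characters that mark a terminal quote so that keys and arrays survive, which goes beyond the rule stated above.

```python
import json

def fix_inner_quotes(text: str, replacement: str = "'") -> str:
    """Heuristically replace double quotes nested inside JSON strings.

    Assumed rule: inside a string, a double quote is terminal only if the
    next non-whitespace character is one of , } : ] or the input ends.
    Any other double quote inside a string is treated as nested.
    """
    out = []
    in_string = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string and ch == "\\" and i + 1 < len(text):
            # Copy an existing escape sequence (e.g. \") unchanged.
            out.append(text[i:i + 2])
            i += 2
            continue
        if ch == '"':
            if not in_string:
                in_string = True          # opening quote
                out.append(ch)
            else:
                rest = text[i + 1:].lstrip()
                if rest == "" or rest[0] in ",}:]":
                    in_string = False     # terminal quote
                    out.append(ch)
                else:
                    out.append(replacement)  # nested quote
        else:
            out.append(ch)
        i += 1
    return "".join(out)

broken = '{\n"one": "some text here",\n"two": "some "other" text here"\n}'
data = json.loads(fix_inner_quotes(broken))

# The ambiguous input is read as three separate properties.
ambiguous = '{\n"one": "some text here",\n"two": "some ", "three": " text here"\n}'
split = json.loads(fix_inner_quotes(ambiguous))
```

Here data["two"] comes out as some 'other' text here, while the ambiguous input is silently read as three properties. A nested quote that happens to be followed by a comma, as in "say "hi", then go", is taken for a terminal quote and the output is no longer valid JSON, which is the wrong answer mentioned above.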
Upvotes: 4