Reputation: 610
I'd like to remove a property from a stringified JSON based on its key, wherever it is and whatever its value type is. As a start, though, removing it only when its value is a string and it is at the root level of the object would be fine. I tried this:
[,]{1}[\s]*?\"attrName\"[ ]*?[:][ ]*?\".*\"[^,]|\"attrName\"[ ]*?[:][ ]*?\".*\"[,]{0,1}
Example : https://regex101.com/r/PAlqYi/1
but it looks far too big for such a simple job. What it does is ensure the comma is removed as well, whether attrName is the first attribute, the last, or somewhere in the middle of the JSON tree. Does anyone have a better idea to make this regex more readable?
Upvotes: 4
Views: 17076
Reputation: 21
Corrected the previous two answers :D
All JSON syntax consists of quotes, colons and commas. We need to focus on these symbols.
First of all, we need an unescaped quote:
(?<!\\)(?:\\\\)*['"]
The object key in JSON is always a JSON string. A JSON string has the following signature: any content wrapped in two identical unescaped quotes:
(?<!\\)(?:\\\\)*('|").*?(?<!\\)(?:\\\\)*\1
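To see what this string pattern does on its own, here is a minimal Python sketch. The helper name find_strings and the sample input are assumptions made for illustration; only the regex itself comes from the answer.

```python
import re

# Pattern from the answer: an unescaped opening quote, lazily matched content,
# then the same quote character again, also unescaped.
JSON_STRING = re.compile(r'''(?<!\\)(?:\\\\)*('|").*?(?<!\\)(?:\\\\)*\1''')


def find_strings(text: str) -> list[str]:
    """Return every quoted string the pattern finds in `text` (hypothetical helper)."""
    return [m.group(0) for m in JSON_STRING.finditer(text)]


# The escaped quotes around hi must not end the second string early.
sample = r'{"name": "say \"hi\"", "n": 1}'
for s in find_strings(sample):
    print(s)
```

The lookbehind plus the pairs of backslashes is what keeps an escaped quote inside the string from being taken as its end.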
Now let's move on to the object. There are three signatures for an object property:
{ property: value } - property\s*:\s*value(?=\s*\})
{ ..., property: value } - ,\s*property\s*:\s*value - comma at the beginning
{ property: value, ... } - property\s*:\s*value\s*, - comma at the end
Please note that all tokens can be separated by spaces and line breaks - \s*.
We combine all three cases and get the following expression:
(,\s*property\s*:\s*value)|(property\s*:\s*value(,|(?=\s*\})))
Now we substitute a specific property and value into this signature, where the property is the string attr and the value is any string:
(,\s*(?<!\\)(?:\\\\)*('|")attr(?<!\\)(?:\\\\)*\2\s*:\s*(?<!\\)(?:\\\\)*('|").*?(?<!\\)(?:\\\\)*\3)|((?<!\\)(?:\\\\)*('|")attr(?<!\\)(?:\\\\)*\5\s*:\s*(?<!\\)(?:\\\\)*('|").*?(?<!\\)(?:\\\\)*\6(,|(?=\s*\})))
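The long expression is easier to follow when it is assembled from the pieces described above. The following Python sketch does that, under a few assumptions: the names UNESCAPED, quoted and remove_string_property are hypothetical, and named groups replace the numbered backreferences so the quote pattern can be reused. It is an illustration of the approach, not a tested replacement for the expression above.

```python
import re

# An even number of backslashes (possibly zero) that is not preceded by another
# backslash, so whatever comes next is not escaped.
UNESCAPED = r"(?<!\\)(?:\\\\)*"


def quoted(body: str, tag: str) -> str:
    """Pattern for `body` wrapped in two identical unescaped quotes.

    `tag` names the group that remembers which quote character was opened.
    """
    return UNESCAPED + rf"""(?P<{tag}>'|")""" + body + UNESCAPED + rf"(?P={tag})"


def remove_string_property(json_text: str, key: str) -> str:
    """Remove `key` and its string value, together with one adjacent comma."""
    k = re.escape(key)
    key1, val1 = quoted(k, "k1"), quoted(".*?", "v1")
    key2, val2 = quoted(k, "k2"), quoted(".*?", "v2")
    # { ..., key: value }  -> take the comma in front
    leading = rf",\s*{key1}\s*:\s*{val1}"
    # { key: value, ... } or { key: value }  -> take the comma behind, or stop before }
    trailing = rf"{key2}\s*:\s*{val2}(?:,|(?=\s*\}}))"
    return re.sub(rf"(?:{leading})|(?:{trailing})", "", json_text)


print(remove_string_property('{"a": 1, "attr": "x", "b": 2}', "attr"))
print(remove_string_property('{"attr": "x"}', "attr"))
```

Note that the trailing case expects the comma directly after the closing quote, as in the combined expression, so input with whitespace before that comma may not behave as intended.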
This solution will work exactly as you expect. The two previous answers work in a similar way, but contain many errors.
https://regex101.com/r/3GKXAs/1
Upvotes: 2
Reputation: 570
If you have any way of using a parser, that is a more stable and readable solution. The regex \s*\"attr\" *: *\".*\"(,|(?=\s*\})) should be shorter and better.
Several changes I made to help:
[,] became a plain comma. If there is only one element in a character class, it should be left by itself.
{0,1} is ? and {1} is pointless.
Checking for } following the line allows you to group the conditionals together.
A lookahead is used for } so it wouldn't be removed during the substitution.
Update with a bugfix mentioned in the comments: trailing commas would be left if the attribute is last. The simplest way I found to fix this was to match both cases, so you'll have to fill in attr twice.
(,\s*\"attr\" *: *\".*\"|(?=\s*\}))|(\s*\"attr\" *: *\".*\"(,|(?=\s*\})))
Examples with added test cases
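For the parser route recommended at the top of this answer, a minimal sketch in Python could look like the following. It assumes Python's standard json module is available; the function names remove_key and remove_property and the recursive flag are hypothetical, chosen for this illustration.

```python
import json


def remove_key(value, key, recursive=True):
    """Return a copy of a parsed JSON value without `key`.

    With recursive=False only the root object is touched.
    """
    if isinstance(value, dict):
        kept = {k: v for k, v in value.items() if k != key}
        if recursive:
            kept = {k: remove_key(v, key) for k, v in kept.items()}
        return kept
    if isinstance(value, list) and recursive:
        return [remove_key(item, key) for item in value]
    return value  # strings, numbers, booleans, null


def remove_property(json_text, key, recursive=True):
    """Parse, drop the key, and serialize again."""
    return json.dumps(remove_key(json.loads(json_text), key, recursive))


text = '{"a": 1, "attr": "x", "b": {"attr": [1, 2], "c": 3}}'
print(remove_property(text, "attr"))
print(remove_property(text, "attr", recursive=False))
```

Commas, nesting, escaped quotes and value types are all handled by the parser here, at the cost of re-serializing the text, so the original whitespace is not preserved.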
Upvotes: 8
Reputation: 41
I modified the regex from the first example; it works better, even if it is flat JSON:
\s*\"attr\" *: *(\"(.*?)\"(,|\s|)|\s*\{(.*?)\}(,|\s|))
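A quick way to try this expression is a small Python sketch like the one below. The constant name ATTR_PATTERN, the helper strip_attr and the sample inputs are assumptions for illustration; the pattern is copied from this answer with attr as the key.

```python
import re

# Pattern from this answer: the key attr followed by either a string value
# or a (non-nested) object value, plus an optional comma or whitespace character.
ATTR_PATTERN = re.compile(r'\s*\"attr\" *: *(\"(.*?)\"(,|\s|)|\s*\{(.*?)\}(,|\s|))')


def strip_attr(json_text: str) -> str:
    """Remove every match of the pattern from `json_text` (hypothetical helper)."""
    return ATTR_PATTERN.sub("", json_text)


print(strip_attr('{"a": 1, "attr": "x", "b": 2}'))      # string value in the middle
print(strip_attr('{"attr": {"k": 1}, "b": 2}'))         # object value at the start
```

When attr is the last property, the comma in front of it is not part of the match, so it stays in the output and may need separate handling.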
Upvotes: 4