Reputation: 177
I have JSON objects in this format:
{
  "1f626": {
    "name": "frowning face with open mouth",
    "ascii": [],
    "code_points": {
      "base": "1f626",
      "default_matches": [
        "1f626"
      ],
      "greedy_matches": [
        "1f626"
      ],
      "decimal": ""
    }
  }
}
I have to remove the code_points object using Regular Expressions.
I have tried using this RegEx:
(("code\w+)(.*)(}))
But it is only selecting the first line.
I have to select everything up to the closing curly bracket in order to fully get rid of the code_points object.
How can I do that?
Note: I have to remove it using Regular Expressions and not JavaScript. Please don't post any JavaScript answers or mark this as a possible duplicate of a JavaScript-based question.
Upvotes: 3
Views: 3381
Reputation: 20782
Alternatively, if you can use jq at the command line:
jq "del(.[].code_points)" <monster.json >smaller_monster.json
This deletes the code_points key inside each second-level object.
It took my machine about 5 seconds on a 60MB document.
It's not a regular expression but it's not JavaScript, either. So, it meets half of your non-functional requirements.
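If jq is not available, the same deletion can be sketched with a JSON parser in Python. This is only an illustration, not part of the jq answer: the function name strip_code_points and the inline sample are hypothetical, and it assumes every top-level value is an object, as in the question's sample.

```python
import json


def strip_code_points(doc: dict) -> dict:
    """Return a copy of doc with the "code_points" key dropped
    from every second-level object (assumes each value is a dict)."""
    return {
        key: {k: v for k, v in entry.items() if k != "code_points"}
        for key, entry in doc.items()
    }


# Hypothetical, shortened sample modelled on the question's data.
sample = json.loads("""
{
  "1f626": {
    "name": "frowning face with open mouth",
    "ascii": [],
    "code_points": {"base": "1f626", "decimal": ""}
  }
}
""")

print(json.dumps(strip_code_points(sample), indent=2))
```

Unlike a streaming tool, this loads the whole document into memory, which is usually acceptable for a file of a few tens of megabytes.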
Upvotes: 3
Reputation: 1052
("code_points")([\s\S]*?)(})
The problem you had is that . matches any character except \n, so in this case I usually use [\s\S], which matches any whitespace or non-whitespace character (that is, any character at all). You should also make the * quantifier lazy by adding ? after it.
Keep in mind that this Regular Expression won't work properly if code_points contains an inner object (another {}).
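To try the lazy [\s\S]*? idea outside an editor, here is a minimal sketch in Python. The pattern is a variation on the one above, not the same expression: it also consumes the comma before "code_points", because removing only the object would leave a dangling comma after "ascii": [] and the result would no longer be valid JSON. It assumes that code_points is the last key of its parent, that it contains no nested {} and that no string value inside it contains a }. The names SAMPLE, CODE_POINTS and remove_code_points are hypothetical.

```python
import json
import re

SAMPLE = """{
  "1f626": {
    "name": "frowning face with open mouth",
    "ascii": [],
    "code_points": {
      "base": "1f626",
      "default_matches": ["1f626"],
      "greedy_matches": ["1f626"],
      "decimal": ""
    }
  }
}"""

# Leading comma, the key, then everything up to the FIRST closing brace.
# [\s\S] is used instead of "." because "." does not match newlines
# unless re.DOTALL is set; "*?" keeps the match lazy.
CODE_POINTS = re.compile(r',\s*"code_points"\s*:\s*\{[\s\S]*?\}')


def remove_code_points(text: str) -> str:
    """Delete every code_points object (and its leading comma) from text."""
    return CODE_POINTS.sub("", text)


cleaned = remove_code_points(SAMPLE)
print(cleaned)

# The result should still parse as JSON under the stated assumptions.
json.loads(cleaned)
```

If code_points could appear as the first key or contain nested objects, a regular expression of this shape is no longer sufficient and a JSON-aware tool is the safer choice.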
Upvotes: 2