Mina

Reputation: 177

Remove an object from JSON using RegEx

I have JSON objects in this format:

 {
     "1f626": {
         "name": "frowning face with open mouth",
         "ascii": [],
         "code_points": {
             "base": "1f626",
             "default_matches": [
                 "1f626"
             ],
             "greedy_matches": [
                 "1f626"
             ],
             "decimal": ""
         }
     }
 }

I have to remove the code_points object using Regular Expressions.


I have tried using this RegEx:

(("code\w+)(.*)(}))

But it only selects the first line. I need the match to extend to the closing curly bracket so that the whole code_points object is removed.

How can I do that?


Note: I have to remove it using Regular Expressions and not JavaScript. Please don't post any JavaScript answers or mark this as a possible duplicate of a JavaScript-based question.

Upvotes: 3

Views: 3381

Answers (2)

Tom Blodget

Reputation: 20782

Alternatively, at the command line, if you can use jq:

jq "del(.[].code_points)" <monster.json >smaller_monster.json

This deletes the code_points key inside each 2nd-level object.

It took my machine about 5 seconds on a 60MB document.

It's not a regular expression but it's not JavaScript, either. So, it meets half of your non-functional requirements.
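
For readers without jq, the same deletion can be sketched with a JSON parser in Python. This is only an illustration of what the filter above does, not part of the original answer: the function name drop_code_points is made up for the example, and it assumes the top level of the document is an object whose values are the entries to clean. Unlike the jq filter, it silently skips values that are not objects.

```python
import json


def drop_code_points(document):
    """Remove the "code_points" key from every second-level object, in place."""
    for entry in document.values():
        # Skip anything that is not an object (an assumption of this sketch).
        if isinstance(entry, dict):
            entry.pop("code_points", None)
    return document


sample = json.loads("""
{
    "1f626": {
        "name": "frowning face with open mouth",
        "ascii": [],
        "code_points": {
            "base": "1f626",
            "default_matches": ["1f626"],
            "greedy_matches": ["1f626"],
            "decimal": ""
        }
    }
}
""")

result = drop_code_points(sample)
print(json.dumps(result, indent=4))
```

Because the document is parsed rather than pattern-matched, nested objects inside code_points and the comma before the key need no special handling.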

Upvotes: 3

CrafterKolyan

Reputation: 1052

("code_points")([\s\S]*?)(})

The problem is that . matches any character except \n. In this case I usually use [\s\S], which matches any whitespace or non-whitespace character (that is, any character at all). You should also make the * quantifier lazy by appending ?, so that the match stops at the first } instead of the last one.

Remember that this regular expression won't work properly if code_points contains an inner object (another {}), because the match stops at the first }.
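
To try the pattern outside an editor, here is a rough Python sketch using the sample from the question. One thing it shows: removing only the match leaves the comma after "ascii": [] dangling, so the result no longer parses as JSON. The second pattern, COMMA_AWARE_PATTERN, is a hypothetical variant (not from the answer above) that also consumes the preceding comma and uses [^{}]* in place of the lazy [\s\S]*?. It assumes code_points is never the first key in its object and contains no braces, not even inside strings. If there is a nested object, it matches nothing instead of removing part of the object.

```python
import json
import re

SAMPLE = """{
    "1f626": {
        "name": "frowning face with open mouth",
        "ascii": [],
        "code_points": {
            "base": "1f626",
            "default_matches": ["1f626"],
            "greedy_matches": ["1f626"],
            "decimal": ""
        }
    }
}"""

# The pattern from this answer, unchanged.
ANSWER_PATTERN = re.compile(r'("code_points")([\s\S]*?)(})')

# Hypothetical variant: also consume the comma before the key, and use
# [^{}]* so the match can never extend across another brace.
COMMA_AWARE_PATTERN = re.compile(r',\s*"code_points"\s*:\s*\{[^{}]*\}')


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


after_answer = ANSWER_PATTERN.sub("", SAMPLE)
after_variant = COMMA_AWARE_PATTERN.sub("", SAMPLE)

# The answer's pattern leaves a trailing comma behind; the variant does not.
print(is_valid_json(after_answer))
print(is_valid_json(after_variant))
print(json.loads(after_variant))
```

In an editor's find-and-replace, the same variant would be used with an empty replacement string, provided the editor's regex flavor lets \s and negated character classes match newlines.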

Upvotes: 2
