BharathYes

Reputation: 1037

How to remove particular JSON attributes? Can a regex in VS Code do it?

I have a JSON file with an array of objects like this:

[
  {
    "_index": "db",
    "_type": "service",
    "_id": "1",
    "_score": 4.0,
    "_source": {
      "contentId": "1",
      "title": "Sample 1",
      "tokenizer": "whitespace",
      "keyword": ["sample1", "service"],
      "desp": "Desc this Service",
      "contentType": "service",
      "url": null,
      "contentCategory": "Services",
      "contentSubCategory": null,
      "assignmentProfile": null,
      "employeeId": null,
      "assignmentProfileId": null,
      "managedRuleId": null,
      "contentAcademy": null,
      "imageUrl": null,
      "metaData": [
        "sample1",
        "services"
      ]
    }
  },
  {
    "_index": "db",
    "_type": "service",
    "_id": "2",
    "_score": 7.0,
    "_source": {
      "contentId": "2",
      "title": "Sample 2",
      "tokenizer": "whitespace",
      "keyword": ["sample2", "service"],
      "desp": "Desc this Service",
      "contentType": "service",
      "url": null,
      "contentCategory": "Services",
      "contentSubCategory": null,
      "assignmentProfile": null,
      "employeeId": null,
      "assignmentProfileId": null,
      "managedRuleId": null,
      "contentAcademy": null,
      "imageUrl": null,
      "metaData": [
        "sample2",
        "services"
      ]
    }
  }
]

I need to remove certain fields from this: every field whose name begins with _ and the metaData field, with the contents of _source moved up one level. It needs to end up like this:

[
  {
    "contentId": "1",
    "title": "Sample 1",
    "tokenizer": "whitespace",
    "keyword": ["sample1", "service"],
    "desp": "Desc this Service",
    "contentType": "service",
    "url": null,
    "contentCategory": "Services",
    "contentSubCategory": null,
    "assignmentProfile": null,
    "employeeId": null,
    "assignmentProfileId": null,
    "managedRuleId": null,
    "contentAcademy": null,
    "imageUrl": null
  },
  {
    "contentId": "2",
    "title": "Sample 2",
    "tokenizer": "whitespace",
    "keyword": ["sample2", "service"],
    "desp": "Desc this Service",
    "contentType": "service",
    "url": null,
    "contentCategory": "Services",
    "contentSubCategory": null,
    "assignmentProfile": null,
    "employeeId": null,
    "assignmentProfileId": null,
    "managedRuleId": null,
    "contentAcademy": null,
    "imageUrl": null
  }
]

I want to write a regular expression in VS Code's find and replace to do the above. I tried the following:

"metaData": \[\r\n (.+) ],

to replace the metaData attribute with an empty string, but it doesn't match.

The array has 100+ objects, so is there an expression that will match this?

Upvotes: 1

Views: 5964

Answers (2)

Mark

Reputation: 181916

Try this in VS Code's find and replace, with regex mode enabled:

(^\s*"_.*"?\n)|(,\n^\s*"metaData":\s*\[[\s\S]+?\s+\])|(^\s{4,}\}$\n)|(^\s{2}(?![\{\}]))

and replace with nothing.

For the full explanation see regex101 demo.

There are 4 alternatives strung together:

(^\s*"_.*"?\n) matches each line whose key begins with _ ("_index": "db", for example, and also the "_source": { line), including its trailing newline.

(,\n^\s*"metaData":\s*\[[\s\S]+?\s+\]) matches the "metaData": [...] block together with the , at the end of the preceding entry, so that no trailing , is left on the last entry of each object.

(^\s{4,}\}$\n) matches the } that closed "_source": { and is no longer needed.

(^\s{2}(?![\{\}])) only fixes the indentation now that "_source": {...} is gone: it matches the first two spaces on every line where they are not immediately followed by { or }. [You may have to adjust the number of spaces depending on your indentation settings.]

You can drop this last alternative if you would rather format the document afterwards with Shift+Alt+F, which should remove those spaces. Note, though, that formatting this way will also change

"keyword": ["sample1", "service"],

to

"keyword": [
  "sample1",
  "service"
],

which you may or may not care about.
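As a rough way to check the combined pattern outside the editor, here is a minimal Python sketch. It assumes that Python's `re` module with `re.MULTILINE` behaves like VS Code's search for this pattern, and that the file uses 2-space indentation and `\n` line endings. `SAMPLE` is a trimmed copy of the question's data, and `strip_fields` is a hypothetical helper name, not something from the answer.

```python
import json
import re

# The four alternatives from the answer, joined with "|". re.MULTILINE makes
# ^ and $ match at line boundaries (assumed to mirror the editor's behavior).
PATTERN = re.compile(
    r'(^\s*"_.*"?\n)'                            # lines whose key starts with _
    r'|(,\n^\s*"metaData":\s*\[[\s\S]+?\s+\])'   # metaData array plus the comma before it
    r'|(^\s{4,}\}$\n)'                           # closing brace left over from "_source"
    r'|(^\s{2}(?![\{\}]))',                      # two spaces of extra indentation
    re.MULTILINE,
)

# Trimmed copy of the question's sample: 2-space indent, "\n" line endings.
SAMPLE = """\
[
  {
    "_index": "db",
    "_type": "service",
    "_id": "1",
    "_score": 4.0,
    "_source": {
      "contentId": "1",
      "title": "Sample 1",
      "keyword": ["sample1", "service"],
      "imageUrl": null,
      "metaData": [
        "sample1",
        "services"
      ]
    }
  },
  {
    "_index": "db",
    "_type": "service",
    "_id": "2",
    "_score": 7.0,
    "_source": {
      "contentId": "2",
      "title": "Sample 2",
      "keyword": ["sample2", "service"],
      "imageUrl": null,
      "metaData": [
        "sample2",
        "services"
      ]
    }
  }
]
"""


def strip_fields(text: str) -> str:
    """Replace every match of PATTERN with nothing."""
    return PATTERN.sub("", text)


cleaned = strip_fields(SAMPLE)
records = json.loads(cleaned)  # raises ValueError if the result is not valid JSON
print(cleaned)
```

Parsing the result with `json.loads` is a cheap guard: if the indentation or line endings of a real file differ from the assumptions above, a failed parse shows it immediately.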



Upvotes: 4

Milan Tejani

Reputation: 372

I have a solution, but I am not sure it will work in VS Code.

I tried it in Sublime Text and it works fine, so I think it will also work in VS Code.

Steps to solve your first problem (lines starting with "_):

  1. Open your JSON file in Sublime Text.
  2. Press Ctrl+H, then click the .* button to enable regex mode (make sure .* is enabled).
  3. Type the regex ^.*"_.*\n in the Find field, leave the Replace field empty, and press Replace.

Steps to solve your second problem (the entire metaData block):

  1. Open your JSON file in Sublime Text.
  2. Press Ctrl+H, then click the .* button to enable regex mode (make sure .* is enabled).
  3. Type the regex \"metadata[^}]*\] in the Find field, leave the Replace field empty, and press Replace. Keep case-sensitive matching turned off, because the key in the file is spelled metaData.

Do not forget to enable the .* (regex) option before replacing.
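To see what these two patterns leave behind, here is a small Python sketch. It is only an approximation: it uses Python's `re` module rather than Sublime Text's engine, `re.IGNORECASE` stands in for a case-insensitive editor search, and `SAMPLE` is a trimmed, single-object copy of the question's data.

```python
import json
import re

# Trimmed copy of one object from the question (2-space indent, "\n" endings).
SAMPLE = """\
[
  {
    "_index": "db",
    "_score": 4.0,
    "_source": {
      "contentId": "1",
      "imageUrl": null,
      "metaData": [
        "sample1",
        "services"
      ]
    }
  }
]
"""

# Step 1: drop every line that contains "_ (this includes the "_source": { line).
step1 = re.sub(r'^.*"_.*\n', "", SAMPLE, flags=re.MULTILINE)

# Step 2: drop the metaData block. IGNORECASE stands in for a case-insensitive
# editor search, because the pattern spells the key "metadata".
step2 = re.sub(r'\"metadata[^}]*\]', "", step1, flags=re.IGNORECASE)

# Count what is left over and would still need fixing.
trailing_commas = len(re.findall(r",\s*\}", step2))
extra_closing_braces = step2.count("}") - step2.count("{")

try:
    json.loads(step2)
    is_valid_json = True
except json.JSONDecodeError:
    is_valid_json = False

print(step2)
print(trailing_commas, extra_closing_braces, is_valid_json)
```

On this sample the targeted text is removed, but one trailing comma and one unmatched closing brace (from the deleted "_source": { line) remain per object, so the result still needs a further cleanup pass before it parses as JSON.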

Upvotes: 1
