Reputation: 1037
I have a JSON file with an array of objects like this:
[
{
"_index": "db",
"_type": "service",
"_id": "1",
"_score": 4.0,
"_source": {
"contentId": "1",
"title": "Sample 1",
"tokenizer": "whitespace",
"keyword": ["sample1", "service"],
"desp": "Desc this Service",
"contentType": "service",
"url": null,
"contentCategory": "Services",
"contentSubCategory": null,
"assignmentProfile": null,
"employeeId": null,
"assignmentProfileId": null,
"managedRuleId": null,
"contentAcademy": null,
"imageUrl": null,
"metaData": [
"sample1",
"services"
]
}
},
{
"_index": "db",
"_type": "service",
"_id": "2",
"_score": 7.0,
"_source": {
"contentId": "2",
"title": "Sample 2",
"tokenizer": "whitespace",
"keyword": ["sample2", "service"],
"desp": "Desc this Service",
"contentType": "service",
"url": null,
"contentCategory": "Services",
"contentSubCategory": null,
"assignmentProfile": null,
"employeeId": null,
"assignmentProfileId": null,
"managedRuleId": null,
"contentAcademy": null,
"imageUrl": null,
"metaData": [
"sample2",
"services"
]
}
}
]
I need to remove certain fields from this: all the fields beginning with _ and the metaData field. It needs to end up like this:
[
{
"contentId": "1",
"title": "Sample 1",
"tokenizer": "whitespace",
"keyword": ["sample1", "service"],
"desp": "Desc this Service",
"contentType": "service",
"url": null,
"contentCategory": "Services",
"contentSubCategory": null,
"assignmentProfile": null,
"employeeId": null,
"assignmentProfileId": null,
"managedRuleId": null,
"contentAcademy": null,
"imageUrl": null
},
{
"contentId": "2",
"title": "Sample 2",
"tokenizer": "whitespace",
"keyword": ["sample2", "service"],
"desp": "Desc this Service",
"contentType": "service",
"url": null,
"contentCategory": "Services",
"contentSubCategory": null,
"assignmentProfile": null,
"employeeId": null,
"assignmentProfileId": null,
"managedRuleId": null,
"contentAcademy": null,
"imageUrl": null
}
]
I want to write a regular expression in VSCode to do the above. I wrote the following:
"metaData": \[\r\n (.+) ],
to replace the metaData attribute with an empty string, but that doesn't match.
The array has 100+ objects, so is there an expression that will match this?
Upvotes: 1
Views: 5964
Reputation: 181916
Try this in VS Code (with regex search enabled):
(^\s*"_.*"?\n)|(,\n^\s*"metaData":\s*\[[\s\S]+?\s+\])|(^\s{4,}\}$\n)|(^\s{2}(?![\{\}]))
and replace with nothing.
For the full explanation see regex101 demo.
There are 4 alternatives strung together:
(^\s*"_.*"?\n)
matches a line such as "_index": "db", including its trailing newline.
(,\n^\s*"metaData":\s*\[[\s\S]+?\s+\])
matches the "metaData": [...] block together with the , at the end of the preceding entry, so that the last remaining entry in each object is not left with a trailing comma.
(^\s{4,}\}$\n)
matches the } that was the closing brace of "_source": { and is no longer needed.
(^\s{2}(?![\{\}]))
only fixes the indentation now that "_source": {...} has been removed: it matches the first two spaces on every line where they are not followed by a { or }. [You may have to adjust the 2 spaces removed depending on your indentation settings.]
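To sanity-check the four alternatives outside the editor, here is a minimal Python sketch. It assumes that Python's re module with re.MULTILINE treats this pattern the same way VS Code's search does (an assumption, not something verified against the editor), that the file uses two-space indentation, and it uses an abbreviated version of the sample data.

```python
import json
import re

# The four alternatives from the answer, joined with | as in the original.
PATTERN = re.compile(
    r'(^\s*"_.*"?\n)'                           # lines starting with "_
    r'|(,\n^\s*"metaData":\s*\[[\s\S]+?\s+\])'  # metaData block plus preceding comma
    r'|(^\s{4,}\}$\n)'                          # closing brace of "_source"
    r'|(^\s{2}(?![\{\}]))',                     # two spaces of leftover indentation
    re.MULTILINE,
)

# Abbreviated sample (assumed two-space indentation).
sample = """\
[
  {
    "_index": "db",
    "_id": "1",
    "_source": {
      "contentId": "1",
      "keyword": ["sample1", "service"],
      "imageUrl": null,
      "metaData": [
        "sample1",
        "services"
      ]
    }
  },
  {
    "_index": "db",
    "_id": "2",
    "_source": {
      "contentId": "2",
      "keyword": ["sample2", "service"],
      "imageUrl": null,
      "metaData": [
        "sample2",
        "services"
      ]
    }
  }
]
"""

# Replace every match with nothing, as in the editor.
cleaned = PATTERN.sub("", sample)
print(cleaned)

# Parsing the result confirms it is still valid JSON.
records = json.loads(cleaned)
```

If the pattern removes too much or too little on a real file, the indentation counts in the last two alternatives are the first thing to check.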
You can drop this last alternative if you would rather just format the document (Shift+Alt+F) instead; that should remove those spaces. I noticed, though, that formatting the document this way will also change your:
"keyword": ["sample1", "service"],
to
"keyword": [
"sample1",
"service"
],
which you may care about or not.
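If the regex turns out to be fragile for a particular file, a different technique is to parse the JSON and rebuild it. This is not the regex approach described in this answer, only a minimal Python sketch of an alternative; the sample data is abbreviated and strip_hits is a hypothetical helper name chosen for illustration.

```python
import json

# Abbreviated sample with the same shape as the data in the question.
raw = """[
  {"_index": "db", "_type": "service", "_id": "1", "_score": 4.0,
   "_source": {"contentId": "1", "title": "Sample 1",
               "keyword": ["sample1", "service"], "imageUrl": null,
               "metaData": ["sample1", "services"]}}
]"""


def strip_hits(hits):
    """Keep only the fields inside each hit's _source, minus metaData."""
    return [
        {key: value for key, value in hit["_source"].items() if key != "metaData"}
        for hit in hits
    ]


cleaned = strip_hits(json.loads(raw))
formatted = json.dumps(cleaned, indent=2)
print(formatted)
```

Like Format Document, json.dumps with indent writes the keyword array across several lines, so the same caveat about layout applies.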
Upvotes: 4
Reputation: 372
I have a solution, but I am not sure it will work in VSCode.
I tried it in Sublime Text Editor and it works fine, so I think it will also work in VSCode.
Steps to solve your first problem (lines starting with "_):
1. Open Sublime Text Editor.
2. Press ctrl+H and then click on the .* symbol to enable regex (make sure .* is enabled).
3. Write ^.*"_.*\n in the Find section, leave the Replace section empty (do not write anything in it), and press Replace.
Steps to solve your second problem (the entire metaData block):
1. Open Sublime Text Editor.
2. Press ctrl+H and then click on the .* symbol to enable regex (make sure .* is enabled).
3. Write \"metaData[^}]*\] in the Find section, leave the Replace section empty (do not write anything in it), and press Replace.
Do not forget to enable the .* symbol.
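To see what these two replacements leave behind, here is a minimal Python sketch that applies them in order to an abbreviated sample. It assumes Python's re module behaves like the editor's regex search for these patterns, and it matches case-insensitively so that the second pattern finds metaData however it is capitalised.

```python
import json
import re

# Abbreviated sample with the same shape as the data in the question.
sample = """\
[
  {
    "_index": "db",
    "_source": {
      "contentId": "1",
      "imageUrl": null,
      "metaData": [
        "sample1",
        "services"
      ]
    }
  }
]
"""

# Step 1: delete every line that contains "_ (including its newline).
step1 = re.sub(r'^.*"_.*\n', "", sample, flags=re.MULTILINE)

# Step 2: delete the metaData block up to its closing bracket.
step2 = re.sub(r'\"metadata[^}]*\]', "", step1, flags=re.IGNORECASE)
print(step2)

# Check whether the result still parses as JSON.
try:
    json.loads(step2)
    is_valid = True
except json.JSONDecodeError:
    is_valid = False
print("valid JSON:", is_valid)
```

On this sample the fields are removed, but the comma before the deleted metaData block and the closing brace of "_source" remain, so the result needs a further cleanup pass before it parses as JSON.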
Upvotes: 1