Shahar Hamuzim Rajuan

Reputation: 6129

sed: replace text between two patterns in a JSON file where the patterns occur multiple times

I need to replace whatever text lies between two patterns in a JSON file. The patterns occur multiple times, and I would like to replace only one occurrence of my choice (say, the 4th occurrence out of 6).

I've created a sed expression that works when the file contains only one occurrence, but when there is more than one, it for some reason doesn't work when I try to replace the second occurrence.

This is my sed:

images_one_line=MAMA

sed -i 's/CDATA\[.*\]\]>/CDATA[ '"${images_one_line}"' ]]>/2' response.json

This is the response.json file:

{"id":"1929905018","type":"page","status":"current","title":"CODEFRESH-CHANGE-API","version":{"by":{"type":"known","accountId":"6066297300b79f0070e308cb","accountType":"atlassian","email":"[email protected]","publicName":"shachar.rajuan","profilePicture":{"path":"/wiki/aa-avatar/6066297300b79f0070e308cb","width":48,"height":48,"isDefault":false},"displayName":"Shachar Rajuan","isExternalCollaborator":false,"_expandable":{"operations":"","personalSpace":""},"_links":{"self":"https://crossixsolutions.atlassian.net/wiki/rest/api/user?accountId=6066297300b79f0070e308cb"}},"when":"2021-06-07T16:00:41.499Z","friendlyWhen":"less than a minute ago","message":"","number":128,"minorEdit":false,"syncRev":"4.confluence$content$1929905018.362","syncRevSource":"synchrony-ack","confRev":"confluence$content$1929905018.363","contentTypeModified":false,"_expandable":{"collaborators":"","content":"/rest/api/content/1929905018"},"_links":{"self":"https://crossixsolutions.atlassian.net/wiki/rest/api/content/1929905018/version/128"}},"macroRenderedOutput":{},"body":{"storage":{"value":"<p /><table data-layout=\"default\"><colgroup><col style=\"width: 70.0px;\" /><col style=\"width: 65.0px;\" /><col style=\"width: 222.0px;\" /><col style=\"width: 48.0px;\" /><col style=\"width: 70.0px;\" /><col style=\"width: 48.0px;\" /><col style=\"width: 65.0px;\" /><col style=\"width: 76.0px;\" /><col style=\"width: 96.0px;\" /></colgroup><tbody><tr><th><p><strong>Environment</strong></p></th><th><p><strong>Data Loaded</strong></p></th><th><p><strong>version/branch</strong></p></th><th><p><strong>Owner</strong></p></th><th><p><strong>Used for</strong></p></th><th colspan=\"3\"><p style=\"text-align: center;\"><strong>Cluster size</strong></p></th><th><p><strong>env prefix</strong></p></th></tr><tr><td data-highlight-colour=\"#f4f5f7\"><p /></td><td data-highlight-colour=\"#f4f5f7\"><p /></td><td data-highlight-colour=\"#f4f5f7\"><p /></td><td data-highlight-colour=\"#f4f5f7\"><p /></td><td 
data-highlight-colour=\"#f4f5f7\"><p /></td><td data-highlight-colour=\"#f4f5f7\"><p><strong>Size</strong></p></td><td data-highlight-colour=\"#f4f5f7\"><p><strong>ES - nodes</strong></p></td><td data-highlight-colour=\"#f4f5f7\"><p><strong>Mongo - Shards</strong></p></td><td data-highlight-colour=\"#f4f5f7\"><p /></td></tr><tr><td><p><span style=\"color: rgb(76,154,255);\">cds-dev-1</span></p></td><td><p>10 node, 1 day,  33 files </p></td><td><ac:structured-macro ac:name=\"warning\" ac:schema-version=\"1\" ac:macro-id=\"20ce769a-0655-4cb5-92ae-8238e47d6729\"><ac:rich-text-body><p>This field is being updated automatically using CodeFresh Build.<br />Do not change it manually! </p></ac:rich-text-body></ac:structured-macro><ac:structured-macro ac:name=\"code\" ac:schema-version=\"1\" ac:macro-id=\"c16e8ce6-aad5-4778-a2f4-eaff5bc3a0f5\"><ac:parameter ac:name=\"language\">bash</ac:parameter><ac:plain-text-body><![CDATA[ \ncds-feeder-srv\n  newTag: devleap \ncds-shard-es-srv\n  newTag: master_fuffffffu \ncds-loader-srv\n  newTag: shahar-pip2  ]]></ac:plain-text-body></ac:structured-macro><p /></td><td><p>DEV/Atara</p></td><td><p>functional tests</p></td><td><p>tiny</p></td><td><p>2 nodes</p></td><td><p>3 shards, M40</p></td><td><p /></td></tr><tr><td><p><span style=\"color: rgb(76,154,255);\">cds-dev-2</span></p></td><td><p>1 node</p></td><td><ac:structured-macro ac:name=\"warning\" ac:schema-version=\"1\" ac:macro-id=\"e233aff4-ae50-4c86-a7c2-188d9b2c03e6\"><ac:rich-text-body><p>This field is being updated automatically using CodeFresh Build.<br />Do not change it manually! 
</p></ac:rich-text-body></ac:structured-macro><p /><ac:structured-macro ac:name=\"code\" ac:schema-version=\"1\" ac:macro-id=\"985a38af-9f30-4329-a763-368a0ebb3984\"><ac:parameter ac:name=\"language\">bash</ac:parameter><ac:plain-text-body><![CDATA[ cds-feeder-srv\n  newTag: testing ]]></ac:plain-text-body></ac:structured-macro></td><td><p>shalom</p></td><td><p>shalom</p></td><td><p>small</p></td><td><p>1 node</p></td><td><p>1 shard</p></td><td><p /></td></tr></tbody></table><p />","representation":"storage","embeddedContent":[],"_expandable":{"content":"/rest/api/content/1929905018"}},"_expandable":{"editor":"","atlas_doc_format":"","view":"","export_view":"","styled_view":"","dynamic":"","editor2":"","anonymous_export_view":""}},"extensions":{"position":363304566},"_expandable":{"childTypes":"","container":"/rest/api/space/SD","metadata":"","operations":"","schedulePublishDate":"","children":"/rest/api/content/1929905018/child","restrictions":"/rest/api/content/1929905018/restriction/byOperation","history":"/rest/api/content/1929905018/history","ancestors":"","descendants":"/rest/api/content/1929905018/descendant","space":"/rest/api/space/SD"},"_links":{"editui":"/pages/resumedraft.action?draftId=1929905018","webui":"/spaces/SD/pages/1929905018/CODEFRESH-CHANGE-API","context":"/wiki","self":"https://crossixsolutions.atlassian.net/wiki/rest/api/content/1929905018","tinyui":"/x/egMIcw","collection":"/rest/api/content","base":"https://crossixsolutions.atlassian.net/wiki"}}

My sed replaces everything between CDATA[ and ]]. This is what I get when using the global option /g:

{"id":"1929905018","type":"page","status":"current","title":"CODEFRESH-CHANGE-API","version":{"by":{"type":"known","accountId":"6066297300b79f0070e308cb","accountType":"atlassian","email":"[email protected]","publicName":"shachar.rajuan","profilePicture":{"path":"/wiki/aa-avatar/6066297300b79f0070e308cb","width":48,"height":48,"isDefault":false},"displayName":"Shachar Rajuan","isExternalCollaborator":false,"_expandable":{"operations":"","personalSpace":""},"_links":{"self":"https://crossixsolutions.atlassian.net/wiki/rest/api/user?accountId=6066297300b79f0070e308cb"}},"when":"2021-06-07T16:14:01.265Z","friendlyWhen":"less than a minute ago","message":"Reverted from v. 134","number":136,"minorEdit":false,"syncRev":"4.confluence$content$1929905018.377","syncRevSource":"synchrony-ack","confRev":"confluence$content$1929905018.379","contentTypeModified":false,"_expandable":{"collaborators":"","content":"/rest/api/content/1929905018"},"_links":{"self":"https://crossixsolutions.atlassian.net/wiki/rest/api/content/1929905018/version/136"}},"macroRenderedOutput":{},"body":{"storage":{"value":"<p /><table data-layout=\"default\"><colgroup><col style=\"width: 70.0px;\" /><col style=\"width: 65.0px;\" /><col style=\"width: 222.0px;\" /><col style=\"width: 48.0px;\" /><col style=\"width: 70.0px;\" /><col style=\"width: 48.0px;\" /><col style=\"width: 65.0px;\" /><col style=\"width: 76.0px;\" /><col style=\"width: 96.0px;\" /></colgroup><tbody><tr><th><p><strong>Environment</strong></p></th><th><p><strong>Data Loaded</strong></p></th><th><p><strong>version/branch</strong></p></th><th><p><strong>Owner</strong></p></th><th><p><strong>Used for</strong></p></th><th colspan=\"3\"><p style=\"text-align: center;\"><strong>Cluster size</strong></p></th><th><p><strong>env prefix</strong></p></th></tr><tr><td data-highlight-colour=\"#f4f5f7\"><p /></td><td data-highlight-colour=\"#f4f5f7\"><p /></td><td data-highlight-colour=\"#f4f5f7\"><p /></td><td data-highlight-colour=\"#f4f5f7\"><p 
/></td><td data-highlight-colour=\"#f4f5f7\"><p /></td><td data-highlight-colour=\"#f4f5f7\"><p><strong>Size</strong></p></td><td data-highlight-colour=\"#f4f5f7\"><p><strong>ES - nodes</strong></p></td><td data-highlight-colour=\"#f4f5f7\"><p><strong>Mongo - Shards</strong></p></td><td data-highlight-colour=\"#f4f5f7\"><p /></td></tr><tr><td><p><span style=\"color: rgb(76,154,255);\">cds-dev-1</span></p></td><td><p>10 node, 1 day,  33 files </p></td><td><ac:structured-macro ac:name=\"warning\" ac:schema-version=\"1\" ac:macro-id=\"20ce769a-0655-4cb5-92ae-8238e47d6729\"><ac:rich-text-body><p>This field is being updated automatically using CodeFresh Build.<br />Do not change it manually! </p></ac:rich-text-body></ac:structured-macro><ac:structured-macro ac:name=\"code\" ac:schema-version=\"1\" ac:macro-id=\"c16e8ce6-aad5-4778-a2f4-eaff5bc3a0f5\"><ac:parameter ac:name=\"language\">bash</ac:parameter><ac:plain-text-body><![CDATA[ cds-feeder-srv newTag: buba cds-shard-es-srv newTag: release3 cds-loader-srv newTag: pip6 ]]></ac:plain-text-body></ac:structured-macro></td><td><p>shalom</p></td><td><p>shalom</p></td><td><p>small</p></td><td><p>1 node</p></td><td><p>1 shard</p></td><td><p /></td></tr></tbody></table><p 
/>","representation":"storage","embeddedContent":[],"_expandable":{"content":"/rest/api/content/1929905018"}},"_expandable":{"editor":"","atlas_doc_format":"","view":"","export_view":"","styled_view":"","dynamic":"","editor2":"","anonymous_export_view":""}},"extensions":{"position":363304566},"_expandable":{"childTypes":"","container":"/rest/api/space/SD","metadata":"","operations":"","schedulePublishDate":"","children":"/rest/api/content/1929905018/child","restrictions":"/rest/api/content/1929905018/restriction/byOperation","history":"/rest/api/content/1929905018/history","ancestors":"","descendants":"/rest/api/content/1929905018/descendant","space":"/rest/api/space/SD"},"_links":{"editui":"/pages/resumedraft.action?draftId=1929905018","webui":"/spaces/SD/pages/1929905018/CODEFRESH-CHANGE-API","context":"/wiki","self":"https://crossixsolutions.atlassian.net/wiki/rest/api/content/1929905018","tinyui":"/x/egMIcw","collection":"/rest/api/content","base":"https://crossixsolutions.atlassian.net/wiki"}}

And when using /2, which should replace the 2nd occurrence, I see no change.

Upvotes: 1

Views: 109

Answers (1)

Jonathan Leffler

Reputation: 754570

That's close to 5 KiB of JSON on a single line — it's a pain to try reading it.

There are two sequences of [CDATA[…]] — the first is about 140 characters long, the second about 45 characters long. Your primary problem is that the .* notation in your sed script is 'greedy'; it will start matching after the first CDATA and read until the end of the second. You need to restrict it so it doesn't skip the ]] end marker. That's not trivial. A moderate approximation is:

sed -e 's/CDATA\[[^]]*\]\]>/CDATA[ '"${images_one_line}"' ]]>/' response.json

That doesn't catch the <![ before the CDATA marker; that presumably doesn't matter.
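The difference between the greedy and the bracket-restricted pattern is easier to see on a short line. This is a minimal sketch: the two-block sample string and the replacement text X are made up for illustration and stand in for response.json.

```shell
# Hypothetical two-block sample, standing in for the real response.json.
sample='a<![CDATA[ one ]]>b<![CDATA[ two ]]>c'

# Greedy: .* runs to the LAST ]]> on the line, so both blocks are swallowed
# by a single match.
greedy=$(printf '%s\n' "$sample" | sed 's/CDATA\[.*\]\]>/CDATA[ X ]]>/')

# Negated class: [^]]* cannot cross a ], so the match ends at the first ]]>.
bounded=$(printf '%s\n' "$sample" | sed 's/CDATA\[[^]]*\]\]>/CDATA[ X ]]>/')

printf '%s\n%s\n' "$greedy" "$bounded"
```

With the greedy pattern the whole span from the first CDATA[ to the final ]]> is one match, which is also why asking for the 2nd occurrence finds nothing to change.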

You can then select which occurrence of CDATA on the line to change by adding a suitable number after the closing / of the s/// command. The numeric flag is part of POSIX sed; only the behaviour of combining it with g is a GNU extension.
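As a sketch of the occurrence number in use, assuming a made-up three-block sample line in place of response.json and the images_one_line variable from the question:

```shell
# Hypothetical three-block sample, standing in for the real response.json.
sample='a<![CDATA[ one ]]>b<![CDATA[ two ]]>c<![CDATA[ three ]]>d'
images_one_line=MAMA

# The trailing 2 replaces only the second match on the line.
second=$(printf '%s\n' "$sample" |
    sed 's/CDATA\[[^]]*\]\]>/CDATA[ '"${images_one_line}"' ]]>/2')

# With the greedy pattern there is only one (huge) match on the line,
# so asking for the second one changes nothing.
unchanged=$(printf '%s\n' "$sample" |
    sed 's/CDATA\[.*\]\]>/CDATA[ '"${images_one_line}"' ]]>/2')

printf '%s\n%s\n' "$second" "$unchanged"
```

The number counts matches within one line (one pattern space), not across the file, so this only selects the right block if all the CDATA sections are on the same line.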

Note that you may still run into trouble if the data in the CDATA includes any close square brackets.

This version uses sed 'extended regular expression' or ERE support. That's usually enabled by -E, but GNU sed also recognizes -r:

sed -E -e 's/<!\[CDATA\[([^]]*|\][^]])*\]\]>/<![CDATA[ substituted ]]>/g' response.json
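To illustrate the square-bracket caveat, here is a small sketch comparing the simple pattern with the ERE one. The sample line, with a single ] inside the first CDATA section, and the replacement text X are invented for the demonstration.

```shell
# Hypothetical sample: the first CDATA section contains a lone ].
sample='x<![CDATA[ arr[0] = 1 ]]>y<![CDATA[ two ]]>z'

# Simple pattern: [^]]* stops at the lone ], the first block cannot match,
# and sed silently replaces the second block instead.
simple=$(printf '%s\n' "$sample" | sed 's/CDATA\[[^]]*\]\]>/CDATA[ X ]]>/')

# ERE pattern: the alternative \][^]] steps over a ] that is not followed
# by another ], so the first block matches as intended.
ere=$(printf '%s\n' "$sample" |
    sed -E 's/<!\[CDATA\[([^]]*|\][^]])*\]\]>/<![CDATA[ X ]]>/')

printf '%s\n%s\n' "$simple" "$ere"
```

The failure mode of the simple pattern is worth noting: it does not report an error, it just edits a different block than the one you counted on.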

A Perl-compatible regex would be better — but still non-trivial. This Perl script shows a working version:

#!/usr/bin/env perl

use strict;
use warnings;

while (<>)
{
    s/<!\[CDATA\[(?:[^]]*|\][^]])*\]\]>/<![CDATA[ substituted ]]>/g;
    print;
}

Editing JSON data like this is problematic, as also noted in the comments.

Upvotes: 2
