Reputation: 407
I'm working on a JSON file (for MongoDB) and need to convert a field's value to a Database Reference. I'm attempting to do it with sed (though I'm open to solutions using awk, etc.), but I'm a complete noob with the tool and am struggling.
Input:
...
"FECTransID" : 4030720141206780377,
"CID" : "N00031103",
"CmteID" : "C00465971",
"RecipCode" : "RW",
"Amount" : 500,
....
Output needed:
...
"FECTransID" : 4030720141206780377,
"CID" : "N00031103",
"CmteID" : {
"ref" : "Cmtes",
"$id" : "C00465971",
"$db" : "OpenSecrets"
},
"RecipCode" : "RW",
"Amount" : 500,
....
My sed command attempt is:
sed -r 's/\"CmteID\" \: \(\"[\w\d]\{9\}\",\)/\"CmteID\" : { \
\"ref\" : \"Cmtes\", \
\"$id\" : \1 \
\"$db\" : \"OpenSecrets\" \
}/' <IN_FILE >OUT_FILE
but I get this error when I run it:
sed: -e expression #1, char 198: invalid reference \1 on `s' command's RHS
Any help would be appreciated. Thanks.
Upvotes: 2
Views: 71
Reputation: 58483
This might work for you (GNU sed):
sed -r 's/"CmteID" : (.*)/"CmteID" : { \
"ref" : "Cmtes", \
"$id" : \1 \
"$db" : "OpenSecrets" \
},/' fileIn >fileOut
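This was a case of over-quoting. With -r in force, sed uses extended regular expressions, where unescaped ( ) form a capture group and \( \) match literal parentheses. The parentheses around the $id value had been escaped unnecessarily, so no group was created and \1 had nothing to refer to.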
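The same grouping rule can be checked outside sed. Here is a minimal sketch in Python, whose re syntax (like sed -r) treats \( and \) as literal parentheses and a bare ( ) as a capture group. The sample line is taken from the question; the helper name try_sub is only for illustration.

```python
import re

# One line from the question's input.
line = '"CmteID" : "C00465971",'

# Replacement text that refers back to capture group 1.
replacement = '"$id" : \\1'


def try_sub(pattern):
    """Return the substituted line, or the error text if group 1 does not exist."""
    try:
        return re.sub(pattern, replacement, line)
    except re.error as exc:
        return "error: " + str(exc)


# Escaped parentheses match literal "(" and ")", so the pattern has no group.
escaped = try_sub(r'"CmteID" : \("\w{9}",\)')

# Unescaped parentheses create the group that \1 needs.
grouped = try_sub(r'"CmteID" : ("\w{9}",)')

print(escaped)
print(grouped)
```

The first call reports an invalid group reference, much like the sed error in the question; the second performs the substitution.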
Upvotes: 0
Reputation: 67507
awk to the rescue!
$ awk '$1=="\"CmteID\""{print $1 ": {";
print "\t\"ref\" : \"Cmtes\",";
print "\t\"$id\" : "$3;
print "\t\"$db\" : \"OpenSecrets\"";
print "},";
next}1' jsonfile
...
"FECTransID" : 4030720141206780377,
"CID" : "N00031103",
"CmteID": {
"ref" : "Cmtes",
"$id" : "C00465971",
"$db" : "OpenSecrets"
},
"RecipCode" : "RW",
"Amount" : 500,
....
With some cleanup:
$ awk -v NT="\n\t" 'function q(x) {return "\""x"\""}
$1==q("CmteID") {$3 = "{" \
NT q("ref") " : " q("Cmtes") "," \
NT q("$id") " : " $3 \
NT q("$db") " : " q("OpenSecrets") \
"\n},"}1' jsonfile
...
"FECTransID" : 4030720141206780377,
"CID" : "N00031103",
"CmteID" : {
"ref" : "Cmtes",
"$id" : "C00465971",
"$db" : "OpenSecrets"
},
"RecipCode" : "RW",
"Amount" : 500,
....
Upvotes: 1
Reputation: 14955
An awk approach:
awk '$1=="\"CmteID\"" {$3="{\n\t\"ref\" : \"Cmtes\",\
\n\t\"\$id\" : "$3"\
\n\t\"\$db\" : \"OpenSecrets\"\n},"}1' infile
Explanation
When the first field matches ($1=="\"CmteID\""), the third field is replaced with the expected string. The only variable part is the CmteID value, which is inserted by: \n\t\"\$id\" : "$3"
The line breaks (escaped with \) were added to make the code easier to read.
Results
"FECTransID" : 4030720141206780377,
"CID" : "N00031103",
"CmteID" : {
"ref" : "Cmtes",
"$id" : "C00465971",
"$db" : "OpenSecrets"
},
"RecipCode" : "RW",
"Amount" : 500,
Upvotes: 2
Reputation: 204055
sed is for simple substitutions on individual lines, that is all. This problem is not like that, so this is not a job for sed.
$ cat tst.awk
BEGIN { FS=OFS=" : " }
$1 == "\"CmteID\"" {
print $1, "{"
print " \"ref\"", "\"Cmtes\","
print " \"$id\"", $2
print " \"$db\"", "\"OpenSecrets\""
$0 = "},"
}
{ print }
$ awk -f tst.awk file
...
"FECTransID" : 4030720141206780377,
"CID" : "N00031103",
"CmteID" : {
"ref" : "Cmtes",
"$id" : "C00465971",
"$db" : "OpenSecrets"
},
"RecipCode" : "RW",
"Amount" : 500,
....
Upvotes: 1
Reputation: 42712
Many languages have built-in JSON parsers. PHP is one of them:
#!/usr/bin/php
<?php
$infile = $argv[1];
$outfile = $argv[2];
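// Pass true so JSON objects decode to associative arrays, which the array syntax below needs.
$data = json_decode(file_get_contents($infile), true);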
$id = $data["CmteID"];
$data["CmteID"] = array("ref"=>"Cmtes", "\$id"=>$id, "\$db"=>"OpenSecrets");
file_put_contents($outfile, json_encode($data));
Untested but it should work. Make it executable and call ./myscript.php IN_FILE OUT_FILE.
My main point is that JSON is structured data, not plain text, and using text replacement on it can lead to problems, just as with other structured formats such as XML!
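For comparison, the same parser-based idea can be sketched in Python with the standard json module. This assumes each record is a complete JSON object (the question elides the surrounding file with "..."), and the function name to_dbref and its parameters are made up for illustration. The key names are copied from the question's desired output.

```python
import json


def to_dbref(doc, collection="Cmtes", database="OpenSecrets"):
    """Return a copy of doc with CmteID replaced by a reference sub-document."""
    out = dict(doc)  # shallow copy, so the caller's dict is left untouched
    out["CmteID"] = {
        "ref": collection,     # key name as written in the question
        "$id": doc["CmteID"],  # the original CmteID value
        "$db": database,
    }
    return out


# Sample record assembled from the fields shown in the question.
sample = """{
  "FECTransID" : 4030720141206780377,
  "CID" : "N00031103",
  "CmteID" : "C00465971",
  "RecipCode" : "RW",
  "Amount" : 500
}"""

converted = to_dbref(json.loads(sample))
print(json.dumps(converted, indent=2))
```

Because the data is parsed rather than pattern-matched, spacing and line breaks in the input do not matter, and the output is always valid JSON.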
Upvotes: 0