user2556342

Reputation: 55

jq: Conditionally update/replace/add json elements using an input file

I receive the following input file:

  • input.json:
[
 {"ID":"aaa_12301248","time_CET":"00:00:00","VALUE":10,"FLAG":"0"},
 {"ID":"aaa_12301248","time_CET":"00:15:00","VALUE":18,"FLAG":"0"},
 {"ID":"aaa_12301248","time_CET":"00:30:00","VALUE":160,"FLAG":"0"},

 {"ID":"bbb_0021122","time_CET":"00:00:00","VALUE":null,"FLAG":"?"},
 {"ID":"bbb_0021122","time_CET":"00:15:00","VALUE":null,"FLAG":"?"},
 {"ID":"bbb_0021122","time_CET":"00:30:00","VALUE":22,"FLAG":"0"},

 {"ID":"ccc_0021122","time_CET":"00:00:00","VALUE":null,"FLAG":"?"},
 {"ID":"ccc_0021122","time_CET":"00:15:00","VALUE":null,"FLAG":"?"},
 {"ID":"ccc_0021122","time_CET":"00:30:00","VALUE":20,"FLAG":"0"},

 {"ID":"ddd_122455","time_CET":"00:00:00","VALUE":null,"FLAG":"?"},
 {"ID":"ddd_122455","time_CET":"00:15:00","VALUE":null,"FLAG":"?"},
 {"ID":"ddd_122455","time_CET":"00:30:00","VALUE":null,"FLAG":"?"}
]

As you can see, there are some valid values (FLAG: "0") and some invalid values (FLAG: "?"). I also have files like the following, one for each ID:

aaa.json:

[
  {"ID":"aaa_12301248","time_CET":"00:00:00","VALUE":10,"FLAG":"0"},
  {"ID":"aaa_12301248","time_CET":"00:15:00","VALUE":null,"FLAG":"?"},
  {"ID":"aaa_12301248","time_CET":"00:55:00","VALUE":45,"FLAG":"0"}
]

As you can see, the first object is the same as in input.json, but the second object is invalid (FLAG: "?"). The second object therefore has to be replaced by the correct object from input.json (the one with VALUE: 18). Objects are identified by their "time_CET" and "ID" elements.

Additionally, input.json will contain new objects that are not yet part of aaa.json etc. These objects should be added to the array, and valid objects from aaa.json should be kept.

In the end, aaa.json should look like this:

[
  {"ID":"aaa_12301248","time_CET":"00:00:00","VALUE":10,"FLAG":"0"},
  {"ID":"aaa_12301248","time_CET":"00:15:00","VALUE":18,"FLAG":"0"},
  {"ID":"aaa_12301248","time_CET":"00:30:00","VALUE":160,"FLAG":"0"},
  {"ID":"aaa_12301248","time_CET":"00:55:00","VALUE":45,"FLAG":"0"}
]

So, to summarize:

  1. Look for FLAG: "?" in aaa.json.
  2. Replace each such object with the matching object from input.json, using "ID" and "time_CET" for the mapping.
  3. Keep existing valid objects and add objects from input.json that did not exist in aaa.json before (only objects whose "ID" field starts with "aaa").
  4. Repeat this for bbb.json, ccc.json and ddd.json.

I am not sure whether this can be done all at once with a command like the one below, because the output has to go back to the correct ID files (aaa.json, bbb.json, ccc.json):

jq --argfile aaa aaa.json --argfile bbb bbb.json .... -f prog.jq input.json

The problem is that the number after the identifier (aaa, bbb, ccc etc.) may change. So, to make sure objects are added to the correct file/array, a statement like this would be required:
if (."ID"|contains("aaa")) then ....

Or is it better to run the program several times with different input parameters? I am not sure.

Thank you in advance!!

Upvotes: 0

Views: 1556

Answers (1)

jq170727

Reputation: 14625

Here is one approach:

#!/bin/bash

# usage: update.sh input.json aaa.json bbb.json....
# updates each of aaa.json bbb.json.... in place
# requires sponge (from moreutils) to write back to the file being read

input_json="$1"
shift

for i in "$@"; do
    jq -M --argfile input_json "$input_json" '

      # functions to restrict input.json to keys of current xxx.json file
      def prefix:              input_filename | split(".")[0];
      def selectprefix:        select(.ID | startswith(prefix));

      # functions to build and probe a lookup table
      def pk:                  [.ID, .time_CET];
      def lookup($t;$k):       $t | getpath($k);
      def lookup($t):          lookup($t;pk);
      def organize(s):         reduce s as $r ({}; setpath($r|pk; $r));

      # functions to identify objects in input.json missing from xxx.json
      def pks:                 paths | select(length==2);
      def missing($t1;$t2):    [$t1|pks] - [$t2|pks] | .[];
      def getmissing($t1;$t2): [ missing($t1;$t2) as $p | lookup($t1;$p)];

      # main routine
        organize(.[]) as $xxx
      | organize($input_json[] | selectprefix) as $inp
      | map(if .FLAG != "?" then . else . += lookup($inp) end)
      | . + getmissing($inp;$xxx)

    ' "$i" | sponge "$i"

done

The script uses jq in a loop to read and update each aaa.json... file.
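A portability note, offered as a sketch rather than a required change: the jq 1.6 manual discourages --argfile in favor of --slurpfile, so --argfile may be unavailable in some jq versions. --slurpfile always wraps the file's JSON texts in an array, which means the script's `$input_json[]` would become `$input_json[0][]`. The one-record input file below is made up for illustration.

```shell
#!/bin/bash
# Sketch: bind input.json with --slurpfile instead of --argfile.
# The single record written here is hypothetical sample data.
workdir=$(mktemp -d)
printf '%s\n' '[{"ID":"aaa_1","time_CET":"00:00:00","VALUE":1,"FLAG":"0"}]' \
  > "$workdir/input.json"

# --slurpfile collects every JSON text in the file into an array, so the
# file's top-level array is element 0 of $input_json.
first_id=$(jq -n -r --slurpfile input_json "$workdir/input.json" \
  '$input_json[0][] | .ID')
echo "$first_id"

rm -r "$workdir"
```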

The filter builds temporary lookup objects keyed by [ID, time_CET], updates every object in aaa.json whose FLAG is "?", and finally appends any objects from input.json that are missing from aaa.json.
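To make the lookup table concrete, here is a minimal sketch that runs the script's pk and organize definitions on two records from the question's input.json and then probes the result with a [ID, time_CET] path, the same way lookup does. The variable names table and value are only for this illustration.

```shell
#!/bin/bash
# Sketch: show the nested lookup table that organize builds.
table=$(jq -n -c '
  def pk:          [.ID, .time_CET];
  def organize(s): reduce s as $r ({}; setpath($r|pk; $r));

  [ {"ID":"aaa_12301248","time_CET":"00:15:00","VALUE":18,"FLAG":"0"},
    {"ID":"aaa_12301248","time_CET":"00:30:00","VALUE":160,"FLAG":"0"} ]
  | organize(.[])
')
# The table is an object keyed by ID, then by time_CET.
echo "$table"

# Probe the table with a two-element path, as lookup($t;$k) does.
value=$(echo "$table" | jq 'getpath(["aaa_12301248","00:15:00"]).VALUE')
echo "$value"
```

Because each record sits at a path of exactly two keys, the pks function in the script can enumerate all primary keys with `paths | select(length==2)`.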

The temporary lookup table for input.json uses input_filename so that only records whose ID starts with the base name of the file currently being processed (for example "aaa" for aaa.json) are included.
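One caveat worth checking, shown as a sketch: input_filename returns the name as it was given on the command line, so a path such as ./aaa.json yields an empty prefix when split on ".", and an empty prefix matches every ID. The last variant, which strips the directory part first, is a hypothetical hardening and not part of the script above.

```shell
#!/bin/bash
# Sketch: how the prefix is derived from the file name passed to jq.
workdir=$(mktemp -d)
echo '[]' > "$workdir/aaa.json"

# Plain file name: the part before the first "." is the prefix.
prefix=$(cd "$workdir" && jq -r 'input_filename | split(".")[0]' aaa.json)
echo "$prefix"

# With a leading "./" the first "."-separated piece is the empty string.
dotted=$(cd "$workdir" && jq -r 'input_filename | split(".")[0]' ./aaa.json)
echo "[$dotted]"

# Hypothetical defensive variant: drop the directory part before splitting.
safe=$(cd "$workdir" && \
  jq -r 'input_filename | split("/")[-1] | split(".")[0]' ./aaa.json)
echo "$safe"

rm -r "$workdir"
```

Calling the script with bare file names from the directory that contains them, as in the sample run, avoids the issue.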

Sample Run:

$ ./update.sh input.json aaa.json

aaa.json after run:

[
  {
    "ID": "aaa_12301248",
    "time_CET": "00:00:00",
    "VALUE": 10,
    "FLAG": "0"
  },
  {
    "ID": "aaa_12301248",
    "time_CET": "00:15:00",
    "VALUE": 18,
    "FLAG": "0"
  },
  {
    "ID": "aaa_12301248",
    "time_CET": "00:55:00",
    "VALUE": 45,
    "FLAG": "0"
  },
  {
    "ID": "aaa_12301248",
    "time_CET": "00:30:00",
    "VALUE": 160,
    "FLAG": "0"
  }
]
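Note that the 00:30:00 record is appended after the 00:55:00 record, whereas the output requested in the question is in chronological order. If the order matters, one option is to add `| sort_by([.ID, .time_CET])` as the last step of the main routine. The sketch below assumes time_CET is always a zero-padded HH:MM:SS string, so that string order equals time order.

```shell
#!/bin/bash
# Sketch: sort the merged array by [ID, time_CET] and list the times.
sorted=$(jq -c 'sort_by([.ID, .time_CET]) | map(.time_CET)' <<'EOF'
[
  {"ID":"aaa_12301248","time_CET":"00:00:00","VALUE":10,"FLAG":"0"},
  {"ID":"aaa_12301248","time_CET":"00:15:00","VALUE":18,"FLAG":"0"},
  {"ID":"aaa_12301248","time_CET":"00:55:00","VALUE":45,"FLAG":"0"},
  {"ID":"aaa_12301248","time_CET":"00:30:00","VALUE":160,"FLAG":"0"}
]
EOF
)
echo "$sorted"
# prints ["00:00:00","00:15:00","00:30:00","00:55:00"]
```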

Upvotes: 1
