Reputation: 20
Premise: Looking to parse a stream of JSON objects from a log file and, based on certain conditions, output the total number of times "id.orig_h" connects to "id.resp_h".
Sample json input:
jq --slurp --raw-output .
{
"ts": 1636606.998991,
"uid": "CgbTrLvhqHAa",
"id.orig_h": "10.8.21.11",
"id.orig_p": 54858,
"id.resp_h": "10.8.21.66",
"id.resp_p": 5044,
"proto": "tcp",
"conn_state": "S0",
"local_orig": true,
"local_resp": true,
"missed_bytes": 0,
"history": "S",
"orig_pkts": 1,
"orig_ip_bytes": 60,
"resp_pkts": 0,
"resp_ip_bytes": 0
},
{
"ts": 1636638.028568,
"uid": "CFNumGx3XYWW7",
"id.orig_h": "fe80::ba:61:fe3f:80",
"id.orig_p": 130,
"id.resp_h": "ff02::1",
"id.resp_p": 131,
"proto": "icmp",
"duration": 3420.447889374,
"orig_bytes": 2608,
"resp_bytes": 0,
"conn_state": "OTH",
"local_orig": false,
"local_resp": false,
"missed_bytes": 0,
"orig_pkts": 163,
"orig_ip_bytes": 11736,
"resp_pkts": 0,
"resp_ip_bytes": 0
},
{
"ts": 1636526872.598889,
"uid": "Cq9JTE1OweOW6mi",
"id.orig_h": "fe::63:88:14f5:b5",
"id.orig_p": 131,
"id.resp_h": "ff02::fb",
"id.resp_p": 130,
"proto": "icmp",
"duration": 81086.88094513,
"orig_bytes": 64000,
"resp_bytes": 0,
"conn_state": "OTH",
"local_orig": false,
"local_resp": false,
"missed_bytes": 0,
"orig_pkts": 4000,
"orig_ip_bytes": 288000,
"resp_pkts": 0,
"resp_ip_bytes": 0
},
{
"ts": 1636604547.798971,
"uid": "Cs41IjaZTAdF7f",
"id.orig_h": "fe::63:88:14f5:b5",
"id.orig_p": 131,
"id.resp_h": "ff02::1:ff:b5",
"id.resp_p": 130,
"proto": "icmp",
"duration": 3414.3990546265,
"orig_bytes": 2608,
"resp_bytes": 0,
"conn_state": "OTH",
"local_orig": false,
"local_resp": false,
"missed_bytes": 0,
"orig_pkts": 163,
"orig_ip_bytes": 11736,
"resp_pkts": 0,
"resp_ip_bytes": 0
}
I believe the conditions part is good
jq -r '. | select(.resp_ip_bytes > 0 and .orig_ip_bytes > 0 and .duration > 0 and .orig_bytes > 0 and .resp_bytes >0)'
However, every time I try group_by([."id.orig_h", ."id.resp_h"]), I get
--> Cannot index number with string "id.orig_h"
Desired Output:
1.1.1.1 -> 2.2.2.2 | XXXX <- # of times
Here is the output without the join(" ")
jq -sr 'map(select(.resp_ip_bytes > 0 and .orig_ip_bytes > 0 and .duration > 0 and .orig_bytes > 0 and .resp_bytes >0)) | group_by([."id.orig_h", ."id.resp_h"]) | map(length as $count | .[] | .count = $count) | sort_by([-.count, -.resp_ip_bytes]) | first | [."id.orig_h", "->", ."id.resp_h", "|", .count]'
[
"10.8.21.11",
"->",
"10.8.21.123",
"|",
225 <--(not sure it matters but output on .count is yellow, all other output is green)
]
With the join(" "), I get
--> string (" ") and number (225) cannot be added
Upvotes: 0
Views: 319
Reputation: 116870
Although the built-in group_by is convenient, it can be very inefficient with respect to both space and time, and the following alternative may be especially appropriate in the context of very long log files.
Notice that inputs is used with the -n command-line option:
< log.json jq -nr '
# Emit a stream of arrays, each array being a group defined by a value of f,
# which can be any jq filter that produces exactly one value for each item in `stream`.
def GROUPS_BY(stream; f):
  reduce stream as $x ({};
    ($x|f) as $s
    | ($s|type) as $t
    | (if $t == "string" then $s else ($s|tojson) end) as $y
    | .[$t][$y] += [$x] )
  | .[][] ;

GROUPS_BY(inputs
          | select(.resp_ip_bytes > 0 and
                   .orig_ip_bytes > 0 and
                   .duration > 0 and
                   .orig_bytes > 0 and
                   .resp_bytes > 0);
          [."id.orig_h", ."id.resp_h"] )
| (first | "\(."id.orig_h") -> \(."id.resp_h")") + " | \(length)"
'
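With the sample above nothing gets past the filters (every object has resp_ip_bytes equal to 0), but on data that does, each surviving group is emitted as one raw line of the desired shape, e.g. (count made up):

10.8.21.11 -> 10.8.21.66 | 42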
Notes:
a) GROUPS_BY as defined above is stream-oriented, both with respect to its input and its output. Apart from that, the main functional difference between GROUPS_BY and group_by is that the latter involves a sort.
b) GROUPS_BY/2 as defined above is relatively complex because it is designed to have the full generality of group_by, for which it is almost a plug-in alternative. Specifically, E | group_by(F) is functionally equivalent to:
[GROUPS_BY(E[]; F)] | sort_by(first | F)
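As a quick sanity check on made-up data (small integers grouped by parity, nothing to do with the log format): the streamed form emits the groups one by one, and the collected-and-sorted form should match plain group_by:

jq -nc '
  def GROUPS_BY(stream; f):
    reduce stream as $x ({};
      ($x|f) as $s
      | ($s|type) as $t
      | (if $t == "string" then $s else ($s|tojson) end) as $y
      | .[$t][$y] += [$x] )
    | .[][] ;

  # streamed groups, in order of first appearance:
  GROUPS_BY(1,2,3,4,5; . % 2),                               # [1,3,5] then [2,4]

  # collected and sorted, which should match group_by:
  ([GROUPS_BY(1,2,3,4,5; . % 2)] | sort_by(first | . % 2)),  # [[2,4],[1,3,5]]
  ([1,2,3,4,5] | group_by(. % 2))                            # [[2,4],[1,3,5]]
'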
Upvotes: 0
Reputation: 36251
Try this
jq --slurp --raw-output '
map(select(.resp_ip_bytes > 0 ... your conditions here ...))
| group_by([."id.orig_h", ."id.resp_h"])
| map(length as $count | .[] | .count = $count)
| sort_by([-.count, -.resp_ip_bytes]) | first
| [."id.orig_h", "->", ."id.resp_h", "|", .count]
| join(" ")
'
This inserts another field, count, holding the number of connections, then sorts first by the highest count, then by the highest resp_ip_bytes, takes the first match, and formats the output as desired.
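One caveat, which is only an assumption about the jq version in play: older jq releases refuse to join(" ") an array that mixes strings and numbers, which is exactly the error reported in the question. In that case the count can be run through tostring first, or the final two lines can be replaced with string interpolation, roughly:

| sort_by([-.count, -.resp_ip_bytes]) | first
| "\(."id.orig_h") -> \(."id.resp_h") | \(.count)"

With --raw-output the interpolated string is printed without surrounding quotes either way.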
Upvotes: 1