mr i.o
mr i.o

Reputation: 952

Calculate Average values in a JSON array

I am working with a JSON file of this format:

  {
  "Response" : {
    "TimeUnit" : [ 1516298400000, 1516302000000, 1516305600000, 1516309200000, 1516312800000, 1516316400000 ],
    "metaData" : {
      "errors" : [ ],
      "notices" : [ "Source:Postgres", "Limit applied: 14400", "PG Host:ruappg0ro.apigeeks.net", "Metric with Avg of total_response_time was requested. For this a global avg was also computed with name global-avg-total_response_time", "query served by:88bec25a-ef48-464e-b41d-e447e3beeb88", "Table used: edge.api.faxgroupusenondn012.agg_api" ]
    },
    "stats" : {
      "data" : [ {
        "identifier" : {
          "names" : [ "apiproxy" ],
          "values" : [ "test" ]
        },
        "metric" : [ {
          "env" : "test",
          "name" : "sum(message_count)",
          "values" : [ 28.0, 129.0, 24.0, 20.0, 71.0, 30.0 ]
        }, {
          "env" : "test",
          "name" : "avg(total_response_time)",
          "values" : [ 312.57142857142856, 344.2480620155039, 374.2083333333333, 381.1, 350.67605633802816, 363.8 ]
        }, {
          "env" : "test",
          "name" : "sum(is_error)",
          "values" : [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 ]
        }, {
          "env" : "test",
          "name" : "global-avg-total_response_time",
          "values" : [ 349.5860927152318 ]
        } ]
      }, {
        "identifier" : {
          "names" : [ "apiproxy" ],
          "values" : [ "test2" ]
        },
        "metric" : [ {
          "env" : "test",
          "name" : "sum(message_count)",
          "values" : [ 0.0, 0.0, 0.0, 16.0, 137.0, 100.0 ]
        }, {
          "env" : "test",
          "name" : "avg(total_response_time)",
          "values" : [ 0.0, 0.0, 0.0, 237.4375, 198.02189781021897, 189.44 ]
        }, {
          "env" : "test",
          "name" : "sum(is_error)",
          "values" : [ 0.0, 0.0, 0.0, 16.0, 137.0, 100.0 ]
        }, {
          "env" : "test",
          "name" : "global-avg-total_response_time",
          "values" : [ 197.12252964426878 ]
        } ]
      }, {
        "identifier" : {
          "names" : [ "apiproxy" ],
          "values" : [ "appdyn" ]
        },
        "metric" : [ {
          "env" : "test",
          "name" : "sum(message_count)",
          "values" : [ 0.0, 0.0, 0.0, 11.0, 137.0, 98.0 ]
        }, {
          "env" : "test",
          "name" : "avg(total_response_time)",
          "values" : [ 0.0, 0.0, 0.0, 170.0, 161.57664233576642, 149.16326530612244 ]
        }, {
          "env" : "test",
          "name" : "sum(is_error)",
          "values" : [ 0.0, 0.0, 0.0, 11.0, 137.0, 98.0 ]
        }, {
          "env" : "test",
          "name" : "global-avg-total_response_time",
          "values" : [ 157.0081300813008 ]
        } ]
      }, {
        "identifier" : {
          "names" : [ "apiproxy" ],
          "values" : [ "AppDyn" ]
        },
        "metric" : [ {
          "env" : "test",
          "name" : "sum(message_count)",
          "values" : [ 0.0, 0.0, 0.0, 3.0, 0.0, 0.0 ]
        }, {
          "env" : "test",
          "name" : "avg(total_response_time)",
          "values" : [ 0.0, 0.0, 0.0, 39.333333333333336, 0.0, 0.0 ]
        }, {
          "env" : "test",
          "name" : "sum(is_error)",
          "values" : [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 ]
        }, {
          "env" : "test",
          "name" : "global-avg-total_response_time",
          "values" : [ 39.333333333333336 ]
        } ]
      } ]
    }
  }
}

and would like to calculate the average of all values under: "name" : "avg(total_response_time)" for each identifieriteratively.

I have tried a few attempts and I really don't know how to proceed as the count of the identifiers and the avg(total_response_time) varies.

for identifier in $(cat response4.json | jq -r  '.[].stats.data[].identifier.values' | sed 's/[][]//g' | sed

's/"//g'); do echo ${identifier}

avg_response_time=$(cat response4.json | jq -r  '.[].stats.data[].metric[]') #don't know how to iterate through the 
done

Any help/idea will be greatly appreciated.

Upvotes: 0

Views: 3625

Answers (2)

peak
peak

Reputation: 116910

First, for clarity, here is a stream-oriented helper function:

def average(s): 
  reduce s as $x (null; .sum += $x | .n += 1)
  | if . == null then null else .sum / .n end;

Next, we have a choice. We can either treat each item in the .stats.data array individually, or we can group the items by the value of .identifier. In the example, the results would be the same (except possibly for the ordering), but let us consider here the two cases separately:

Average per item in .stats.data

.Response.stats.data[]
| {id: (.identifier.values),
   average: average(.metric[]
     | select(.name == "avg(total_response_time)")
     | .values[]) }

Group by .identifier

.Response.stats.data
| group_by(.identifier)[]
| {id: (.[0].identifier.values),
   average: (.[].metric[] 
     | select(.name == "avg(total_response_time)") 
     | .values[] ) }

Output

{"id":["test"],"average":354.43398004304896}
{"id":["test2"],"average":104.14989963503649}
{"id":["appdyn"],"average":80.1233179403148}
{"id":["AppDyn"],"average":6.555555555555556}

Upvotes: 1

Charles Duffy
Charles Duffy

Reputation: 295678

jq -r '
    .[].stats.data[]
    | (.identifier.values[]) as $identifier
    | (.metric[]
       | select(.name == "avg(total_response_time)")
       | .values
      ) as $values
    | [$identifier, ($values | add) / ($values | length)]
    | @tsv
    ' <test.json

...yields:

test    354.43398004304896
test2   104.14989963503649
appdyn  80.1233179403148
AppDyn  6.555555555555556

Upvotes: 0

Related Questions