Termite

Reputation: 76

Complex JMESPath filter on a large JSON file

Please consider the following JSON extract (the real data is much larger; this is a shortened piece that I'm trying to get working):

jsonData = """{
  "products" : {
    "DQ578CGN99KG6ECF" : {
      "sku" : "DQ578CGN99KG6ECF",
      "productFamily" : "Compute",
      "attributes" : {
        "location" : "US East (N. Virginia)",
        "instanceType" : "hs1.8xlarge",
        "tenancy" : "Shared",
        "operatingSystem" : "Windows",
        "licenseModel" : "License Included",
        "preInstalledSw" : "NA"
      }
    },
    "G2N9F3PVUVK8ZTGP" : {
      "sku" : "G2N9F3PVUVK8ZTGP",
      "productFamily" : "Instance",
      "attributes" : {
        "location" : "Asia Pacific (Seoul)",
        "instanceType" : "i2.xlarge",
        "tenancy" : "Host",
        "operatingSystem" : "Windows",
        "licenseModel" : "License Included",
        "preInstalledSw" : "SQL Server Enterprise"
      }
    },
    "FBZZ2TKXWWY5HZRX" : {
      "sku" : "FBZZ2TKXWWY5HZRX",
      "productFamily" : "Compute",
      "attributes" : {
        "location" : "Asia Pacific (Seoul)",
        "instanceType" : "i2.4xlarge",
        "tenancy" : "Dedicated",
        "operatingSystem" : "SUSE",
        "licenseModel" : "No License required",
        "preInstalledSw" : "NA"
      }
    }
  }
}"""

I can't work out a filter that finds, say, all products with "Windows" as operatingSystem and "Shared" as tenancy.

I got to this point:

import json
import jmespath

priceJson = json.loads(jsonData)
query = "products.*.attributes[?operatingSystem=='Windows' && tenancy=='Shared']"
output_dict = jmespath.search(query, priceJson)

However, I lose the sku this way, because the query projects only the attributes object of each product.

Result:

[
  {
    "location" : "US East (N. Virginia)",
    "instanceType" : "hs1.8xlarge",
    "tenancy" : "Shared",
    "operatingSystem" : "Windows",
    "licenseModel" : "License Included",
    "preInstalledSw" : "NA"
  }
]

What I'd like to get:

[
  {
    "sku" : "DQ578CGN99KG6ECF",
    "attributes" : {
      "location" : "US East (N. Virginia)",
      "instanceType" : "hs1.8xlarge",
      "tenancy" : "Shared",
      "operatingSystem" : "Windows",
      "licenseModel" : "License Included",
      "preInstalledSw" : "NA"
    }
  }
]

Any idea how to get that result?

Upvotes: 4

Views: 3890

Answers (2)

Kris

Reputation: 61

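You could have done it with a single query. Select sku and attributes together with a multi-select hash first, then filter the resulting list on the nested attribute fields:

products.*.{sku: sku, attributes: attributes}[?attributes.operatingSystem=='Windows' && attributes.tenancy=='Shared']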

Upvotes: 1

Termite

Reputation: 76

I kept looking for an answer to this and finally got the result I was after.

The key was to do it in two steps: first flatten each product into a single object that includes the sku, then filter that flat list.

This is the code I use now:

#!/usr/bin/env python
try:
    # For Python 3.0 and later
    from urllib.request import urlopen
except ImportError:
    # Fall back to Python 2's urllib2
    from urllib2 import urlopen

import json, jmespath

jsonData = """{
  "products" : {
    "DQ578CGN99KG6ECF" : {
      "sku" : "DQ578CGN99KG6ECF",
      "productFamily" : "Compute",
      "attributes" : {
        "location" : "US East (N. Virginia)",
        "instanceType" : "hs1.8xlarge",
        "tenancy" : "Shared",
        "operatingSystem" : "Windows",
        "licenseModel" : "License Included",
        "preInstalledSw" : "NA"
      }
    },
    "G2N9F3PVUVK8ZTGP" : {
      "sku" : "G2N9F3PVUVK8ZTGP",
      "productFamily" : "Instance",
      "attributes" : {
        "location" : "Asia Pacific (Seoul)",
        "instanceType" : "i2.xlarge",
        "tenancy" : "Host",
        "operatingSystem" : "Windows",
        "licenseModel" : "License Included",
        "preInstalledSw" : "SQL Server Enterprise"
      }
    },
    "FBZZ2TKXWWY5HZRX" : {
      "sku" : "FBZZ2TKXWWY5HZRX",
      "productFamily" : "Compute",
      "attributes" : {
        "location" : "Asia Pacific (Seoul)",
        "instanceType" : "i2.4xlarge",
        "tenancy" : "Dedicated",
        "operatingSystem" : "SUSE",
        "licenseModel" : "No License required",
        "preInstalledSw" : "NA"
      }
    }
  }
}"""

priceJson = json.loads(jsonData)

# Step 1: flatten every product into one object that carries the sku.
query = (
    "products.*.{sku: sku, location: attributes.location, "
    "instanceType: attributes.instanceType, tenancy: attributes.tenancy, "
    "operatingSystem: attributes.operatingSystem, "
    "licenseModel: attributes.licenseModel, "
    "preInstalledSw: attributes.preInstalledSw}"
)
flat_products = jmespath.search(query, priceJson)

# Step 2: filter the flattened list.
query2 = "[?operatingSystem=='Windows' && tenancy=='Shared']"
output_dict = jmespath.search(query2, flat_products)

print(json.dumps(output_dict, indent=2))

and the result:

[
  {
    "preInstalledSw": "NA",
    "location": "US East (N. Virginia)",
    "sku": "DQ578CGN99KG6ECF",
    "operatingSystem": "Windows",
    "tenancy": "Shared",
    "instanceType": "hs1.8xlarge",
    "licenseModel": "License Included"
  }
]

Upvotes: 2
