Thiousi

Reputation: 63

Filter Json based on nested value

I have been struggling to filter a JSON file and have tried multiple solutions without success.

My JSON looks like this:

{
  "some site": {
    "https://url.com/123...": {
      "Product Name": "A",
      "Product Price": "1213",
      "Product Category": "A",
      "Product Availability": "Not Available"
    },
    "https://url.com/456...": {
      "Product Name": "B",
      "Product Price": "59.95",
      "Product Category": "A",
      "Product Availability": "In Stock"
    }
  },
  "some other site": {
    "https://other_url.com/904543...": {
      "Product Name": "C",
      "Product Price": "479.95",
      "Product Category": "A",
      "Product Availability": "Not Available"
    },
    "https://other_url.com/432489...": {
      "Product Name": "D",
      "Product Price": "5",
      "Product Category": "B",
      "Product Availability": "In Stock"
    }
  }
}

I would like to filter the entire structure, keeping only the products where "Product Availability" is "In Stock". The expected result is:

{
  "some site": {
    "https://url.com/456...": {
      "Product Name": "B",
      "Product Price": "59.95",
      "Product Category": "A",
      "Product Availability": "In Stock"
    }
  },
  "some other site": {
    "https://other_url.com/432489...": {
      "Product Name": "D",
      "Product Price": "5",
      "Product Category": "B",
      "Product Availability": "In Stock"
    }
  }
}

I am reading the file using json.load():

def read_json(filename):
    with open(filename, encoding='utf-8') as json_file:
        return json.load(json_file)

Minimal reproducible example:

import json

data = """
{
  "some site": {
    "https://url.com/123...": {
      "Product Name": "A",
      "Product Price": "1213",
      "Product Category": "A",
      "Product Availability": "Not Available"
    },
    "https://url.com/456...": {
      "Product Name": "B",
      "Product Price": "59.95",
      "Product Category": "A",
      "Product Availability": "In Stock"
    }
  },
  "some other site": {
    "https://other_url.com/904543...": {
      "Product Name": "C",
      "Product Price": "479.95",
      "Product Category": "A",
      "Product Availability": "Not Available"
    },
    "https://other_url.com/432489...": {
      "Product Name": "D",
      "Product Price": "5",
      "Product Category": "B",
      "Product Availability": "In Stock"
    }
  }
}"""

products = json.loads(data)

output_dict = [x for x,(z) in products.items() if z["Product Availability"] == "In Stock"]
print(output_dict)

This raises KeyError: 'Product Availability'

Any help would be much appreciated!

Upvotes: 1

Views: 2571

Answers (2)

Tomalak

Reputation: 338148

You can do it with a nested dict comprehension, but it's not exactly easy to understand at first glance:

{key: {s: v for s, v in val.items() if v.get("Product Availability") == "In Stock"} for key, val in products.items()}

which gives:

{
    "some site": {
        "https://url.com/456...": {
            "Product Name": "B",
            "Product Price": "59.95",
            "Product Category": "A",
            "Product Availability": "In Stock"
        }
    },
    "some other site": {
        "https://other_url.com/432489...": {
            "Product Name": "D",
            "Product Price": "5",
            "Product Category": "B",
            "Product Availability": "In Stock"
        }
    }
}

...but realistically, it might be more manageable to use a nested loop:

import json

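# data is the JSON string from the question
site_data = json.loads(data)
result = {}

for site_title, urls in site_data.items():
    # Keep every site in the result, even if none of its products match
    result[site_title] = {}
    for url, url_data in urls.items():
        # Check the product's own attributes, not the top-level dict
        if url_data.get("Product Availability") == "In Stock":
            result[site_title][url] = url_data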

The result is the same.
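
One way to check that claim is to run both variants on the same input and compare them. The snippet below is only a sketch: it uses a shortened copy of the question's sample data, and the names by_comprehension and by_loop are made up for illustration.

```python
import json

# Shortened copy of the question's sample data: one site, two products.
data = """
{
  "some site": {
    "https://url.com/123...": {
      "Product Name": "A",
      "Product Availability": "Not Available"
    },
    "https://url.com/456...": {
      "Product Name": "B",
      "Product Availability": "In Stock"
    }
  }
}"""

products = json.loads(data)

# Variant 1: nested dict comprehension.
by_comprehension = {
    site: {url: attrs for url, attrs in urls.items()
           if attrs.get("Product Availability") == "In Stock"}
    for site, urls in products.items()
}

# Variant 2: explicit nested loop.
by_loop = {}
for site, urls in products.items():
    by_loop[site] = {}
    for url, attrs in urls.items():
        if attrs.get("Product Availability") == "In Stock":
            by_loop[site][url] = attrs

print(by_comprehension == by_loop)  # prints True
print(json.dumps(by_loop, indent=4))
```

Both variants use .get(), so a product that lacks the "Product Availability" key is skipped instead of raising a KeyError.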

Upvotes: 2

Barmar

Reputation: 780724

You need to use a dictionary comprehension to create a dictionary, not a list comprehension.

Product Availability is a key of the dicts nested inside z, not a key of z itself. You need a nested dict comprehension to filter the products in each site.

output_dict = {site: {
    url: attributes for url, attributes in p.items() if attributes['Product Availability'] == "In Stock"
    } for site, p in products.items()}

This can be written more understandably using regular nested loops.

output_dict = {}
for site, product_dict in products.items():
    output_site = {}
    for url, attributes in product_dict.items():
        if attributes['Product Availability'] == 'In Stock':
            output_site[url] = attributes
    output_dict[site] = output_site
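
If you need this in more than one place, the loop can be wrapped in a small helper. This is a sketch under a few assumptions, not something from the question: the function name filter_products, its parameters, and the sample dict are all hypothetical. It uses .get() instead of [] so that a product without the key is skipped, and the optional drop_empty flag leaves out sites with no matching products.

```python
def filter_products(sites, key="Product Availability", wanted="In Stock",
                    drop_empty=False):
    """Return a new dict keeping only products whose `key` equals `wanted`."""
    result = {}
    for site, product_dict in sites.items():
        # .get() returns None for a missing key, so such products are skipped
        kept = {url: attrs for url, attrs in product_dict.items()
                if attrs.get(key) == wanted}
        # Keep empty sites unless the caller asked to drop them
        if kept or not drop_empty:
            result[site] = kept
    return result


# Hypothetical sample data, shaped like the question's JSON.
sample = {
    "site one": {
        "url-1": {"Product Name": "A", "Product Availability": "Not Available"},
        "url-2": {"Product Name": "B", "Product Availability": "In Stock"},
    },
    "site two": {
        "url-3": {"Product Name": "C", "Product Availability": "Not Available"},
        "url-4": {"Product Name": "D"},  # key missing on purpose
    },
}

print(filter_products(sample))
print(filter_products(sample, drop_empty=True))
```

With the default drop_empty=False the output has the same shape as the loops above: every site is present, and "site two" maps to an empty dict.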

Upvotes: 4
