Reputation: 3760
I'm parsing complex JSON data with Python. The JSON data looks like the following:
{
  "data": [
    {
      "product_sn": "ABP-145",
      "process_data": [
        {
          "step_name": "step_a",
          "progress": {
            "total_steps": 10,
            "finished_steps": 10
          }
        },
        {
          "step_name": "step_b",
          "progress": {
            "total_steps": 9,
            "finished_steps": 6
          }
        },
        {
          "step_name": "step_c",
          "progress": {
            "total_steps": 15,
            "finished_steps": 15
          }
        }
      ]
    },
    {
      "product_sn": "ABP-146",
      "process_data": [
        {
          "step_name": "step_a",
          "progress": {
            "total_steps": 10,
            "finished_steps": 8
          }
        },
        {
          "step_name": "step_b",
          "progress": {
            "total_steps": 9,
            "finished_steps": 6
          }
        }
      ]
    }
  ]
}
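The business scenario is: to produce the product, we have several steps: step_a, step_b, and step_c. To start step_c, the prerequisites (as encoded in the query and code below) are that step_a is finished, i.e. its total_steps equals its finished_steps, and that step_c has not been started yet.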
Now I want to get all the product_sn which are ready to start step_c.
Currently I'm using several nested 'for' loops to handle the nested dictionary-and-list object created by json.loads(). The code is long, complex, and hard to maintain. I'm wondering if there is a simple way, like JSONPath, to do it with something like:
get(
    value=data.product_sn,
    criteria=(
        data.process_data(step_name=="step_a").
            progress("total_steps".value == "finished_steps".value) and
        $not_exist data.process_data.step_name=="step_c"
    )
)
So I can get all the product_sn values matching the search condition.
I searched the examples and tried jsonpath_ng and jsonpath_rw, but the examples are very simple. Could anyone let me know how to implement the above query in a simple way? I really don't want to use the long, complex and ugly nested 'for' loops anymore.
Below is my code for handling this JSON (it has actually been simplified a lot to explain my question; the actual business logic is far more complex):
import json
json_str = '''{
"data": [{
"product_sn": "ABP-145",
"process_data": [{
"step_name": "step_a",
"progress": {
"total_steps": 10,
"finished_steps": 10
}
},
{
"step_name": "step_b",
"progress": {
"total_steps": 9,
"finished_steps": 6
}
},
{
"step_name": "step_c",
"progress": {
"total_steps": 15,
"finished_steps": 15
}
}
]
},
{
"product_sn": "ABP-146",
"process_data": [{
"step_name": "step_a",
"progress": {
"total_steps": 10,
"finished_steps": 8
}
},
{
"step_name": "step_b",
"progress": {
"total_steps": 9,
"finished_steps": 6
}
}]
},
{
"product_sn": "ABP-147",
"process_data": [{
"step_name": "step_a",
"progress": {
"total_steps": 10,
"finished_steps": 10
}
},
{
"step_name": "step_b",
"progress": {
"total_steps": 9,
"finished_steps": 6
}
}]
}]
}'''
json_obj = json.loads(json_str)
valid_products = list()
for product in json_obj.get('data'):
    product_sn = product['product_sn']
    process_data = product.get("process_data")
    if not process_data:
        continue
    valid_product = False
    for step in process_data:
        step_name = step['step_name']
        if step_name == 'step_c':
            valid_product = False
            break
        elif step_name == 'step_a':
            progress = step['progress']
            if progress['total_steps'] == progress['finished_steps']:
                valid_product = True
            else:
                valid_product = False
                break
    if valid_product:
        valid_products.append(product_sn)
    else:
        continue
print(valid_products)
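For comparison with the loop above, the get(value=..., criteria=...) shape of the pseudo-query can be approximated with the standard library alone. This is only a rough sketch, not a JSONPath replacement: `select`, `has_step`, `step_done`, and `step` are hypothetical helper names invented for this example (they are not part of jsonpath_ng or jsonpath_rw), and the sample data is made up to mirror the three products in the code above.

```python
def has_step(product, name):
    """True if the product has a process_data entry with this step_name."""
    return any(s["step_name"] == name for s in product.get("process_data", []))


def step_done(product, name):
    """True if the named step exists and total_steps == finished_steps."""
    return any(
        s["step_name"] == name
        and s["progress"]["total_steps"] == s["progress"]["finished_steps"]
        for s in product.get("process_data", [])
    )


def select(items, value, criteria):
    """Return value(item) for every item that satisfies criteria(item)."""
    return [value(item) for item in items if criteria(item)]


def step(name, total, finished):
    # Hypothetical shorthand for building one process_data entry.
    return {"step_name": name,
            "progress": {"total_steps": total, "finished_steps": finished}}


# Made-up sample data shaped like the JSON in the question.
json_obj = {"data": [
    {"product_sn": "ABP-145", "process_data": [
        step("step_a", 10, 10), step("step_b", 9, 6), step("step_c", 15, 15)]},
    {"product_sn": "ABP-146", "process_data": [
        step("step_a", 10, 8), step("step_b", 9, 6)]},
    {"product_sn": "ABP-147", "process_data": [
        step("step_a", 10, 10), step("step_b", 9, 6)]},
]}

ready = select(
    json_obj["data"],
    value=lambda p: p["product_sn"],
    criteria=lambda p: step_done(p, "step_a") and not has_step(p, "step_c"),
)
print(ready)  # ['ABP-147']
```

The small predicates keep each business rule in one place, so a more complex condition can be composed from them instead of adding another loop level.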
Upvotes: 0
Views: 336
Reputation: 4298
You could use a more functional approach to make that a bit cleaner.
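import json
from operator import itemgetter

json_obj = json.loads(json_str)
products = json_obj.get("data")
# process_data is a list of steps, so scan it with any() instead of indexing by key
valid_products = filter(
    lambda p: any(
        s["step_name"] == "step_a" and
        s["progress"]["total_steps"] == s["progress"]["finished_steps"]
        for s in p.get("process_data", [])
    ) and not any(
        # per the question, a product that already has step_c is excluded
        s["step_name"] == "step_c" for s in p.get("process_data", [])
    ),
    products
)
# filter() and map() are lazy in Python 3; list() materializes the result
valid_product_sns = list(map(itemgetter("product_sn"), valid_products))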
Of course, that filtering lambda is still pretty ugly.
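One possible way to tame that lambda is to move the condition into a named predicate and index the steps by name first. The sketch below is an illustration only: `ready_for_step_c` is a hypothetical name, the sample data is made up to match the question's structure, and it assumes step names are unique within a product. The step_c exclusion comes from the requirement stated in the question.

```python
from operator import itemgetter


def ready_for_step_c(product):
    """True if step_a is complete and no step_c entry exists yet."""
    # Index the list of steps by name so each one can be looked up directly.
    # Assumption: a step_name appears at most once per product.
    progress_by_step = {
        s["step_name"]: s["progress"] for s in product.get("process_data", [])
    }
    step_a = progress_by_step.get("step_a")
    return (
        step_a is not None
        and step_a["total_steps"] == step_a["finished_steps"]
        and "step_c" not in progress_by_step
    )


# Made-up sample data shaped like the "data" list in the question.
products = [
    {"product_sn": "ABP-145", "process_data": [
        {"step_name": "step_a", "progress": {"total_steps": 10, "finished_steps": 10}},
        {"step_name": "step_c", "progress": {"total_steps": 15, "finished_steps": 15}},
    ]},
    {"product_sn": "ABP-146", "process_data": [
        {"step_name": "step_a", "progress": {"total_steps": 10, "finished_steps": 8}},
    ]},
    {"product_sn": "ABP-147", "process_data": [
        {"step_name": "step_a", "progress": {"total_steps": 10, "finished_steps": 10}},
        {"step_name": "step_b", "progress": {"total_steps": 9, "finished_steps": 6}},
    ]},
    {"product_sn": "ABP-148"},  # no process_data at all
]

valid_product_sns = list(
    map(itemgetter("product_sn"), filter(ready_for_step_c, products))
)
print(valid_product_sns)  # ['ABP-147']
```

A named function can also be unit-tested on its own, which is harder to do with an inline lambda.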
Upvotes: 0
Reputation: 1384
Assuming your JSON object is stored in the variable o:
prods = [p['product_sn'] for p in o['data'] if [a for a in p['process_data'] if a['step_name']=="step_a" and a['progress']['total_steps']==a['progress']['finished_steps']] and not [c for c in p['process_data'] if c['step_name']=="step_c"]]
Sorry for the long one-liner; I don't have PyCharm at hand at the moment to break it into several lines so it looks good.
You can check the working code here: link to repl.it
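For readability, the same comprehension can be spread over several lines, with any() standing in for the "non-empty list is truthy" trick. This is just a sketch of one possible layout; the sample data is made up to match the question's structure, and `o` and `prods` follow the names used above.

```python
import json

# Made-up sample shaped like the question's JSON.
o = json.loads("""
{"data": [
  {"product_sn": "ABP-145", "process_data": [
    {"step_name": "step_a", "progress": {"total_steps": 10, "finished_steps": 10}},
    {"step_name": "step_c", "progress": {"total_steps": 15, "finished_steps": 15}}]},
  {"product_sn": "ABP-146", "process_data": [
    {"step_name": "step_a", "progress": {"total_steps": 10, "finished_steps": 8}}]},
  {"product_sn": "ABP-147", "process_data": [
    {"step_name": "step_a", "progress": {"total_steps": 10, "finished_steps": 10}},
    {"step_name": "step_b", "progress": {"total_steps": 9, "finished_steps": 6}}]}
]}
""")

prods = [
    p["product_sn"]
    for p in o["data"]
    # keep the product if some step_a entry is complete ...
    if any(
        s["step_name"] == "step_a"
        and s["progress"]["total_steps"] == s["progress"]["finished_steps"]
        for s in p["process_data"]
    )
    # ... and no step_c entry exists yet
    and not any(s["step_name"] == "step_c" for s in p["process_data"])
]
print(prods)  # ['ABP-147']
```

Unlike the list-building version, any() stops at the first matching step, though for lists this small the difference is mostly about readability.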
Upvotes: 1