Bob

Reputation: 1396

How to get a value from JSON

This is the first time I'm working with JSON, and I'm trying to pull the url value out of the JSON below.

{
    "name": "The_New11d112a_Company_Name",
    "sections": [
        {
            "name": "Products",
            "payload": [
                {
                    "id": 1,
                    "name": "TERi Geriatric Patient Skills Trainer,
                    "type": "string"
                }
            ]
        },
        {
            "name": "Contact Info",
            "payload": [
                {
                    "id": 1,
                    "name": "contacts",
                    "url": "https://www.a3bs.com/catheterization-kits-8000892-3011958-3b-scientific,p_1057_31043.html",
                    "contacts": [
                        {
                        "name": "User",
                        "email": "Company Email",
                        "phone": "Company PhoneNumber"
                    }
                ],
                "type": "contact"
            }
        ]
    }
],
"tags": [
    "Male",
    "Airway"
],
"_id": "0e4cd5c6-4d2f-48b9-acf2-5aa75ade36e1"

}

I have been able to access description and _id via

data = json.loads(line)
if 'xpath' in data:
    xpath = data["_id"]
    description = data["sections"][0]["payload"][0]["description"]

However, I can't figure out a way to access url. Another issue is that sections could contain other items, which makes indexing into Contact Info by position a non-starter.

Upvotes: 1

Views: 240

Answers (5)

CommonFool

Reputation: 13

This should print out the URL:

import json

# Open the JSON file for reading
with open('test.json', 'r') as f:
    # Parse the file contents into Python objects
    data = json.load(f)

# From the structure of the JSON, the 'url' key sits in the first payload
# entry of the second section
print(data['sections'][1]['payload'][0]['url'])

The exact location of the 'url' key:

The value of the key 'sections' is an array; take the element at index 1 (the second element).

That element is a dict, and its key 'payload' contains another array.

The element at index 0 of that array is a dict with the key 'url'.
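
The path described above can be walked one step at a time. This is only a minimal sketch: the inlined JSON is a trimmed-down stand-in for the corrected data, the URL is a placeholder, and the variable names are illustrative.

```python
import json

# Trimmed-down stand-in for the corrected JSON (assumption: same shape,
# placeholder URL instead of the real one).
raw = '''
{
  "sections": [
    {"name": "Products", "payload": [{"id": 1, "type": "string"}]},
    {"name": "Contact Info", "payload": [{"id": 1, "url": "https://example.com/item.html"}]}
  ]
}
'''

data = json.loads(raw)
sections = data["sections"]    # list of section dicts
contact = sections[1]          # second section, "Contact Info"
payload = contact["payload"]   # list of payload dicts
first_item = payload[0]        # first payload entry
url = first_item["url"]        # the value we are after
print(url)
```

Each intermediate variable can be printed to check which level of the structure you are at.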


While testing my solution, I noticed that the JSON provided is flawed. After fixing the three flaws, I ended up with this:

{
"name": "The_New11d112a_Company_Name",
"sections": [
    {
        "name": "Products",
        "payload": [
            {
                "id": 1,
                "name": "TERi Geriatric Patient Skills Trainer",
                "type": "string"
            }
        ]
    },
    {
        "name": "Contact Info",
        "payload": [
            {
                "id": 1,
                "name": "contacts",
                "url": "https://www.a3bs.com/catheterization-kits-8000892-3011958-3b-scientific,p_1057_31043.html",
                "contacts": [
                    {
                    "name": "User",
                    "email": "Company Email",
                    "phone": "Company PhoneNumber"
                }
            ],
            "type": "contact"
        }
    ]
}
],
"tags": [
"Male",
"Airway"
],
"_id": "0e4cd5c6-4d2f-48b9-acf2-5aa75ade36e1"}

Upvotes: 1

Deyuan

Reputation: 56

Using the JSON provided by Vincent55, I wrote working code with exception handling, under certain assumptions.

Working Code:

# Assuming that the target data is always under sections[i].payload
from json import load

with open("data.json") as f:
    sections = load(f)["sections"]

for section in sections:
    try:
        # Assumes only the first payload entry matters
        url = section["payload"][0]["url"]
    except (KeyError, IndexError):
        # Section has no payload, an empty payload, or no url
        continue
    if url:
        print(url)

Upvotes: 0

ThangTD

Reputation: 1684

I think it's worth writing a short function that returns the url(s). You can then decide whether to use the first url in the returned list, or skip processing if there's no url in your data.

The function could look like this:

def extract_urls(data):
    payloads = []
    for section in data['sections']:
        payloads += section.get('payload') or []

    urls = [x['url'] for x in payloads if 'url' in x]

    return urls
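
A usage sketch for the function above, repeated here so the snippet runs on its own. The sample data is made up for illustration (placeholder URL, and a hypothetical "Notes" section with no payload) and only mimics the shape of the JSON in the question.

```python
def extract_urls(data):
    payloads = []
    for section in data['sections']:
        # A section without 'payload' (or with a null one) contributes nothing
        payloads += section.get('payload') or []
    return [x['url'] for x in payloads if 'url' in x]


# Made-up sample: the section holding the url is not at a fixed index
sample = {
    "sections": [
        {"name": "Products", "payload": [{"id": 1, "type": "string"}]},
        {"name": "Notes"},
        {"name": "Contact Info",
         "payload": [{"id": 1, "url": "https://example.com/item.html"}]},
    ]
}

urls = extract_urls(sample)
# Use the first url if there is one, otherwise skip processing
first_url = urls[0] if urls else None
print(first_url)
```

Because the function scans every section, it does not matter where Contact Info sits in the list.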

Upvotes: 1

Vincent55

Reputation: 33

I think your JSON is damaged; it should look like this:

{
    "name": "The_New11d112a_Company_Name",
    "sections": [
        {
            "name": "Products",
            "payload": [
                {
                    "id": 1,
                    "name": "TERi Geriatric Patient Skills Trainer",
                    "type": "string"
                }
            ]
        },
        {
            "name": "Contact Info",
            "payload": [
                {
                    "id": 1,
                    "name": "contacts",
                    "url": "https://www.a3bs.com/catheterization-kits-8000892-3011958-3b-scientific,p_1057_31043.html",
                    "contacts": [
                        {
                        "name": "User",
                        "email": "Company Email",
                        "phone": "Company PhoneNumber"
                    }
                ],
                "type": "contact"
            }
        ]
    }
],
"tags": [
    "Male",
    "Airway"
],
"_id": "0e4cd5c6-4d2f-48b9-acf2-5aa75ade36e1"
}

You can check it on http://json.parser.online.fr/.

To get the value of url:

import json

with open('yourJSONfile.json') as f:
    j = json.load(f)
print(j['sections'][1]['payload'][0]['url'])
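
If the position of Contact Info inside sections is not guaranteed, one option is to look the section up by its name instead of by index. This is a sketch only: find_section_url is a hypothetical helper, and the sample data below is made up with a placeholder URL.

```python
def find_section_url(data, section_name="Contact Info"):
    """Return the first 'url' in the named section, or None if absent."""
    for section in data.get("sections", []):
        if section.get("name") != section_name:
            continue
        for item in section.get("payload") or []:
            if "url" in item:
                return item["url"]
    return None


# Made-up sample where Contact Info is not at index 1
sample = {
    "sections": [
        {"name": "Products", "payload": [{"id": 1, "type": "string"}]},
        {"name": "Extras", "payload": []},
        {"name": "Contact Info",
         "payload": [{"id": 1, "url": "https://example.com/item.html"}]},
    ]
}

print(find_section_url(sample))
```

Returning None instead of raising lets the caller decide how to handle data that has no url.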

Upvotes: 1

tomarv2

Reputation: 823

Hope this helps:

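import json

with open("test.json", "r") as f:
    json_out = json.load(f)

# Walk every payload entry of every section
for section in json_out["sections"]:
    for item in section["payload"]:
        for key in item:
            # Substring match: any key whose name contains "url" is printed
            if "url" in key:
                print(key, '->', item[key])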

Upvotes: 1
