Rao572

Reputation: 19

Iterate through JSON data to get URLs

I'm trying to parse the following JSON data and extract URL values using a Python function. From the JSON example below, I would like to get each URL under the jobs key and store it in one of two arrays: one for URLs of entries that have a color key, and one for URLs of entries that do not. The function should then return both arrays. I'm very new to Python and need some help with this.

    {  
        "_class":"com.cloudbees.hudson.plugins.folder.Folder",
        "actions":[  ],
        "description":"This is a TSG level folder.",
        "displayName":"CONSOLIDATED",
        "displayNameOrNull":null,
        "fullDisplayName":"CONSOLIDATED",
        "fullName":"CONSOLIDATED",
        "name":"CONSOLIDATED",
        "url":"https://cyggm.com/job/CONSOLIDATED/",
        "healthReport":[  
         {  
             "description":"Projects enabled for building: 187 of 549",
             "iconClassName":"icon-health-20to39",
             "iconUrl":"health-20to39.png",
             "score":34
         }
         ],
         "jobs":[  
             {  
                 "_class":"com.cloudbees.hudson.plugins.folder.Folder",
                  "name":"yyfyiff",
                  "url":"https://tdyt.com/job/ 
                   CONSOLIDATED/job/yfiyf/"
             },
            {  
               "_class":"com.cloudbees.hudson.plugins.folder.Folder",
               "name":"Ops-Prod-Jobs",
               "url":"https://ygduey.com/job/ 
                CONSOLIDATED/job/Ops-Prod-Jobs/"
             },
            {  
                "_class":"com.cloudbees.hudson.plugins.folder.Folder",
                "name":"TEST-DATA-MGMT",
                "url":"https://futfu.com/job/ 
               CONSOLIDATED/job/TEST-DATA-MGMT/"
            },
            {  
                "_class":"com.cloudbees.hudson.plugins.folder.Folder",
                "name":"TESTING-OPS",
                  "url":"https://gfutfu.com/job/ 
               CONSOLIDATED/job/TESTING-OPS/"
             },
             {  
                  "_class":"com.cloudbees.hudson.plugins.folder.Folder",
                  "name":"Performance_Engineering Team",
                 "url":"https://ytdyt.com/job/ 
                  CONSOLIDATED/job/Performance_Engineering%20Team/"
            },
              {  
                    "_class":"hudson.model.FreeStyleProject",
                     "name":"test",
                     "url":"https://tduta.com/job/ 
                      CONSOLIDATED/job/test/",
                     "color":"notbuilt"
            }
            ],
              "primaryView":{  
                  "_class":"hudson.model.AllView",
                 "name":"all",
                   "url":"https://fuyfi.com/job/ 
            CONSOLIDATED/"
              },
             "views":[  
             {  
                    "_class":"hudson.model.AllView",
                     "name":"all",
                    "url":"https://utfufu.com/job/
                   CONSOLIDATED/"
              }
              ]
            }

The following is the Python code I used to get the jobs data, but I'm not able to iterate through it to get all the URLs. I only get one URL at a time, depending on the index I put in the code.

    import json
    import requests

    # url, username and password are defined elsewhere
    req = requests.get(url, verify=False, auth=(username, password))
    j = json.loads(req.text)
    jobs = j['jobs']
    print(jobs[1]['url'])

This prints the second URL, but I don't see a way to check whether that entry has a color key.

Upvotes: 0

Views: 211

Answers (1)

aydow

Reputation: 3801

First of all, your JSON as posted is not valid: several of the url strings contain a literal line break, which JSON does not allow inside a string. Run it through a JSON formatter or validator to find and fix any such issues.
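If the line breaks really are in the data and not just an artifact of pasting, one possible workaround is sketched below. It assumes the only defect is whitespace inside the URL strings; the `raw` fragment and the `clean_url` helper are made up for illustration. Passing `strict=False` to `json.loads` makes the decoder accept control characters (such as newlines) inside strings, after which the stray whitespace can be stripped from each URL.

```python
import json

# Hypothetical fragment mimicking the question's data: the URL string
# contains a literal newline plus indentation, which strict JSON rejects.
raw = ('{"jobs": [{"name": "test", "url": "https://tduta.com/job/ \n'
       '      CONSOLIDATED/job/test/", "color": "notbuilt"}]}')

# The default (strict) decoder refuses the embedded newline.
try:
    json.loads(raw)
    strict_ok = True
except json.JSONDecodeError:
    strict_ok = False

# strict=False lets the decoder accept control characters inside strings.
data = json.loads(raw, strict=False)


def clean_url(url):
    """Remove all whitespace that the line breaks left inside the URL."""
    return "".join(url.split())


for job in data["jobs"]:
    job["url"] = clean_url(job["url"])

print(strict_ok)
print(data["jobs"][0]["url"])
```

Encoded spaces such as `%20` are untouched by `clean_url`, since only literal whitespace characters are removed.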

Once the JSON is valid, read the file in as a string:

    In [87]: with open('data.json', 'r') as f:
        ...:     data = f.read()
        ...:

Then, using the json library (import json), load the data into a dict:

    In [88]: d = json.loads(data)

You can then use two list comprehensions to split the URLs by whether the record has a color key:

    In [90]: no_color = [record['url'] for record in d['jobs'] if 'color' not in record]

    In [91]: color = [record['url'] for record in d['jobs'] if 'color' in record]

    In [93]: no_color
    Out[93]:
    ['https://tdyt.com/job/CONSOLIDATED/job/yfiyf/',
     'https://ygduey.com/job/CONSOLIDATED/job/Ops-Prod-Jobs/',
     'https://futfu.com/job/CONSOLIDATED/job/TEST-DATA-MGMT/',
     'https://gfutfu.com/job/CONSOLIDATED/job/TESTING-OPS/',
     'https://ytdyt.com/job/CONSOLIDATED/job/Performance_Engineering%20Team/']

    In [94]: color
    Out[94]: ['https://tduta.com/job/CONSOLIDATED/job/test/']
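Since the question asks for a function that returns both arrays, the same logic could be wrapped up roughly as follows. This is only a sketch: the name `split_job_urls` and the trimmed-down `sample` dict are made up for illustration, and it assumes the response has already been parsed into a dict with a `jobs` list.

```python
def split_job_urls(data):
    """Return (with_color, without_color) lists of job URLs.

    `data` is the parsed JSON response; jobs lacking a "url" are skipped.
    """
    with_color = []
    without_color = []
    for job in data.get("jobs", []):
        url = job.get("url")
        if url is None:
            continue
        # The mere presence of the "color" key decides which list is used.
        if "color" in job:
            with_color.append(url)
        else:
            without_color.append(url)
    return with_color, without_color


# Trimmed-down sample based on two entries from the question.
sample = {
    "jobs": [
        {"name": "Ops-Prod-Jobs",
         "url": "https://ygduey.com/job/CONSOLIDATED/job/Ops-Prod-Jobs/"},
        {"name": "test",
         "url": "https://tduta.com/job/CONSOLIDATED/job/test/",
         "color": "notbuilt"},
    ]
}

color, no_color = split_job_urls(sample)
print(color)
print(no_color)
```

With the code from the question, this would be called as `split_job_urls(j)` after `j = json.loads(req.text)`. A single loop is used here instead of two comprehensions so the jobs list is traversed only once; either approach gives the same result.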

Upvotes: 1
