meller

Reputation: 21

JSON Processing Using Python

I'm trying to download a JSON file, save it, and iterate through it in order to extract all the information and save it in variables. I'm then going to format a message in CSV format to send the data to another system. My problem is the JSON data. It appears to be a dictionary within a list and I'm not sure how to process it.

Here's the json:

[ {
  "ipAddress" : "",
  "feedDescription" : "Botted Node Feed",
  "bnFeedVersion" : "1.1.4",
  "generatedTs" : "2013-08-01 12:00:10.360+0000",
  "count" : 642903,
  "firstDiscoveredTs" : "2013-07-21 19:07:20.627+0000",
  "lastDiscoveredTs" : "2013-08-01 00:34:41.052+0000",
  "threatType" : "BN",
  "confidence" : 82,
  "discoveryMethod" : "spamtrap",
  "indicator" : true,
  "supportingData" : {
    "behavior" : "spamming",
    "botnetName" : null,
    "spamtrapData" : {
      "uniqueSubjectCount" : 88
    },
    "p2pData" : {
      "connect" : null,
      "port" : null
    }
  }
}, {
  "ipAddress" : "",
  "feedDescription" : "Botted Node Feed",
  "bnFeedVersion" : "1.1.4",
  "generatedTs" : "2013-08-01 12:00:10.360+0000",
  "count" : 28,
  "firstDiscoveredTs" : "2013-07-19 03:19:08.622+0000",
  "lastDiscoveredTs" : "2013-08-01 01:44:04.009+0000",
  "threatType" : "BN",
  "confidence" : 40,
  "discoveryMethod" : "spamtrap",
  "indicator" : true,
  "supportingData" : {
    "behavior" : "spamming",
    "botnetName" : null,
    "spamtrapData" : {
      "uniqueSubjectCount" : 9
    },
    "p2pData" : {
      "connect" : null,
      "port" : null
    }
  }
}, {
  "ipAddress" : "",
  "feedDescription" : "Botted Node Feed",
  "bnFeedVersion" : "1.1.4",
  "generatedTs" : "2013-08-01 12:00:10.360+0000",
  "count" : 160949,
  "firstDiscoveredTs" : "2013-07-16 18:52:33.881+0000",
  "lastDiscoveredTs" : "2013-08-01 03:14:59.452+0000",
  "threatType" : "BN",
  "confidence" : 82,
  "discoveryMethod" : "spamtrap",
  "indicator" : true,
  "supportingData" : {
    "behavior" : "spamming",
    "botnetName" : null,
    "spamtrapData" : {
      "uniqueSubjectCount" : 3
    },
    "p2pData" : {
      "connect" : null,
      "port" : null
    }
  }
} ]

My code:

download = 'https:URL.BNfeed20130801.json'

request = requests.get(download, verify=False)
out  = open(fileName, 'w')
for row in request:
    if row.strip():
        for column in row:
            out.write(column)
    else:
        continue
out.close()
time.sleep(4)
jsonRequest = request.json()

for item in jsonRequest:
    print jsonRequest[0]['ipAddress']
    print jsonRequest[item]['ipAddress']  # I also tried this

When I do the above it just prints the same IP over and over again. I've put in the print statement for testing purposes only. Once I figure out how to access the different elements of the JSON, I will store them in variables and then use those variables accordingly. Any help is greatly appreciated.

Thanks in advance for any help. I'm using Python 2.6 on Linux.

Upvotes: 2

Views: 951

Answers (2)

abarnert

Reputation: 365587

alecxe's answer tells you how to fix this, but let me try to explain what's wrong with the original code.


It may be easier to understand with a simpler example, one you can run through an interactive visualizer:

a = ['a', 'b', 'c']

When you do this:

for item in a:

item will be 'a' the first time through, then 'b', then 'c'.

But if you do this:

for item in a:
    print a[0]

… you're completely ignoring item. It's just going to print the letter a three times, because each time you go through the loop, you're just asking for a[0], that is, the first thing in a.

And if you do this:

for item in a:
    print a[item]

… it's going to raise an exception (a TypeError, since list indices must be integers), because you're asking for the 'a'th thing in the list, which is nonsense.
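To see that failure without crashing the script, here is a small sketch that catches the exception and prints its message. The variable name `error` is only for illustration, and the exact wording of the message varies between Python versions.

```python
a = ['a', 'b', 'c']

try:
    a['a']  # same mistake as a[item] when item is the string 'a'
except TypeError as exc:
    # Lists can only be indexed by integers (or slices), never by their own items
    error = str(exc)
    print(error)
```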

But in this code:

for item in a:
    print item

… you'll print 'a', then 'b', then 'c', which is exactly what you want.

You could also do this:

for index, item in enumerate(a):
    print a[index]

… but that's silly. If you need the index, use enumerate, but if you just need the item itself… you've already got it.


So, back to your real code:

for item in jsonRequest:
    print jsonRequest[0]['ipAddress']

Again, you're ignoring item and asking for jsonRequest[0] each time.

And in this code:

for item in jsonRequest:
    print jsonRequest[item]['ipAddress']  # I also tried this

… you're asking for the {complicated dictionary}th thing in jsonRequest, which is again nonsense.

But in this code:

for item in jsonRequest:
    print item['ipAddress']

… you're using each item, just as in the simple example.
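The same idea carries over to the nested parts of each record. A minimal sketch, assuming records shaped like the sample in the question; the data here is trimmed, the IP addresses are placeholders (the question's were blank), and `records` and `seen` are names made up for illustration:

```python
import json

# Trimmed stand-in for the feed in the question; the IP addresses are
# placeholders, not real data.
raw = '''[
  {"ipAddress": "192.0.2.1", "confidence": 82,
   "supportingData": {"behavior": "spamming", "botnetName": null,
                      "spamtrapData": {"uniqueSubjectCount": 88}}},
  {"ipAddress": "192.0.2.2", "confidence": 40,
   "supportingData": {"behavior": "spamming", "botnetName": null,
                      "spamtrapData": {"uniqueSubjectCount": 9}}}
]'''

records = json.loads(raw)  # a list of dicts, like request.json() returns

seen = []
for item in records:
    # item is already the dict for this record, so index into it directly
    ip = item['ipAddress']
    # nested dicts are reached by chaining keys
    subjects = item['supportingData']['spamtrapData']['uniqueSubjectCount']
    seen.append((ip, subjects))
    print('%s %s %s' % (ip, item['confidence'], subjects))
```

A JSON null such as botnetName comes back as None, so check for that before using such a value.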

Upvotes: 2

alecxe

Reputation: 473753

You are basically iterating over a list of dicts; try just item['ipAddress'] inside the loop.
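Since the end goal is CSV, that per-item access could feed Python's csv module. This is only a rough sketch: the column choice, the sample values and the helper name `records_to_csv` are all assumptions. It is written for Python 3; on the Python 2.6 mentioned in the question the in-memory buffer would need swapping (for example for `StringIO.StringIO`).

```python
import csv
import io
import json

# Trimmed, made-up sample in the shape of the question's feed
raw = '''[
  {"ipAddress": "192.0.2.1", "threatType": "BN", "confidence": 82,
   "supportingData": {"behavior": "spamming"}},
  {"ipAddress": "192.0.2.2", "threatType": "BN", "confidence": 40,
   "supportingData": {"behavior": "spamming"}}
]'''
records = json.loads(raw)


def records_to_csv(records):
    """Return the chosen fields of each record as CSV text (hypothetical helper)."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(['ipAddress', 'threatType', 'confidence', 'behavior'])
    for item in records:
        writer.writerow([
            item['ipAddress'],
            item['threatType'],
            item['confidence'],
            item['supportingData']['behavior'],  # nested value
        ])
    return buf.getvalue()


csv_text = records_to_csv(records)
print(csv_text)
```

Letting csv.writer build the rows avoids hand-rolled quoting if a field ever contains a comma.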

Upvotes: 4
