Reputation: 21
I'm trying to download a JSON file, save it, and iterate through it in order to extract all the information and save it in variables. I'm then going to format a message in CSV format to send the data to another system. My problem is the JSON data. It appears to be a dictionary within a list and I'm not sure how to process it.
Here's the json:
[ {
  "ipAddress" : "",
  "feedDescription" : "Botted Node Feed",
  "bnFeedVersion" : "1.1.4",
  "generatedTs" : "2013-08-01 12:00:10.360+0000",
  "count" : 642903,
  "firstDiscoveredTs" : "2013-07-21 19:07:20.627+0000",
  "lastDiscoveredTs" : "2013-08-01 00:34:41.052+0000",
  "threatType" : "BN",
  "confidence" : 82,
  "discoveryMethod" : "spamtrap",
  "indicator" : true,
  "supportingData" : {
    "behavior" : "spamming",
    "botnetName" : null,
    "spamtrapData" : {
      "uniqueSubjectCount" : 88
    },
    "p2pData" : {
      "connect" : null,
      "port" : null
    }
  }
}, {
  "ipAddress" : "",
  "feedDescription" : "Botted Node Feed",
  "bnFeedVersion" : "1.1.4",
  "generatedTs" : "2013-08-01 12:00:10.360+0000",
  "count" : 28,
  "firstDiscoveredTs" : "2013-07-19 03:19:08.622+0000",
  "lastDiscoveredTs" : "2013-08-01 01:44:04.009+0000",
  "threatType" : "BN",
  "confidence" : 40,
  "discoveryMethod" : "spamtrap",
  "indicator" : true,
  "supportingData" : {
    "behavior" : "spamming",
    "botnetName" : null,
    "spamtrapData" : {
      "uniqueSubjectCount" : 9
    },
    "p2pData" : {
      "connect" : null,
      "port" : null
    }
  }
}, {
  "ipAddress" : "",
  "feedDescription" : "Botted Node Feed",
  "bnFeedVersion" : "1.1.4",
  "generatedTs" : "2013-08-01 12:00:10.360+0000",
  "count" : 160949,
  "firstDiscoveredTs" : "2013-07-16 18:52:33.881+0000",
  "lastDiscoveredTs" : "2013-08-01 03:14:59.452+0000",
  "threatType" : "BN",
  "confidence" : 82,
  "discoveryMethod" : "spamtrap",
  "indicator" : true,
  "supportingData" : {
    "behavior" : "spamming",
    "botnetName" : null,
    "spamtrapData" : {
      "uniqueSubjectCount" : 3
    },
    "p2pData" : {
      "connect" : null,
      "port" : null
    }
  }
} ]
My code:
download = 'https:URL.BNfeed20130801.json'
request = requests.get(download, verify=False)
out = open(fileName, 'w')
for row in request:
    if row.strip():
        for column in row:
            out.write(column)
    else:
        continue
out.close()
time.sleep(4)
jsonRequest = request.json()
for item in jsonRequest:
    print jsonRequest[0]['ipAddress']
    print jsonRequest[item]['ipAddress'] --I also tried this
When I do the above it just prints the same IP over and over again. I've put in the print statement for testing purposes only. Once I figure out how to access the different elements of the JSON, I will store them in variables and use those variables accordingly. Any help is greatly appreciated.
Thanks in advance for any help. I'm using Python 2.6 on Linux.
Upvotes: 2
Views: 951
Reputation: 365587
alecxe's answer tells you how to fix this, but let me try to explain what's wrong with the original code.
It may be easier to understand with a simpler example, one you can run through an interactive visualizer:
a = ['a', 'b', 'c']
When you do this:
for item in a:
… item will be 'a' the first time through, then 'b', then 'c'.
But if you do this:
for item in a:
    print a[0]
… you're completely ignoring item. It's just going to print the letter a 3 times, because each time you go through the loop, you're just asking for a[0], that is, the first thing in a.
And if you do this:
for item in a:
    print a[item]
… it's going to raise an exception, because you're asking for the 'a'th thing in the list, which is nonsense.
But in this code:
for item in a:
    print item
… you'll print 'a', then 'b', then 'c', which is exactly what you want.
You could also do this:
for index, item in enumerate(a):
    print a[index]
… but that's silly. If you need the index, use enumerate, but if you just need the item itself… you've already got it.
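The loop variants above can be checked side by side. This is only a small sketch: it collects the results in lists (the names same, each, error and numbered are made up for the illustration) and uses parenthesized print calls so that it should run on Python 3 as well as Python 2.6.

```python
a = ['a', 'b', 'c']

# Ignoring the loop variable: the same element every time.
same = [a[0] for item in a]

# Using the loop variable directly: each element in turn.
each = [item for item in a]

# Indexing the list with one of its own elements fails,
# because list indices have to be integers.
try:
    a['a']
    error = None
except TypeError as exc:
    error = exc

# enumerate supplies the position for the cases where it is really needed.
numbered = ["%d: %s" % (index, item) for index, item in enumerate(a)]

print(same)
print(each)
print(type(error).__name__)
print(numbered)
```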
So, back to your real code:
for item in jsonRequest:
    print jsonRequest[0]['ipAddress']
Again, you're ignoring item and asking for jsonRequest[0] each time.
And in this code:
for item in jsonRequest:
    print jsonRequest[item]['ipAddress'] --I also tried this
… you're asking for the {complicated dictionary}th thing in jsonRequest, which is again nonsense.
But in this code:
for item in jsonRequest:
    print item['ipAddress']
… you're using each item, just as in the simple example.
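The same idea extends to the nested parts of each record. Here is a rough sketch, assuming the feed has already been parsed into a list of dicts: it uses json.loads on a trimmed-down inline sample instead of a real download, the IP addresses are placeholder documentation addresses (the question left them blank), and summarize is a made-up helper name.

```python
import json

# Two trimmed-down records in the same shape as the feed in the question.
# The IP addresses are placeholders, not real feed data.
sample = '''
[ {"ipAddress": "192.0.2.1", "confidence": 82,
   "supportingData": {"behavior": "spamming", "botnetName": null,
                      "spamtrapData": {"uniqueSubjectCount": 88}}},
  {"ipAddress": "192.0.2.2", "confidence": 40,
   "supportingData": {"behavior": "spamming", "botnetName": null,
                      "spamtrapData": {"uniqueSubjectCount": 9}}} ]
'''

records = json.loads(sample)  # a list of dicts


def summarize(record):
    """Pull a few fields out of one record; JSON null arrives as None."""
    supporting = record['supportingData']
    return (record['ipAddress'],
            record['confidence'],
            supporting['behavior'],
            supporting['botnetName'],
            supporting['spamtrapData']['uniqueSubjectCount'])


# Each item is one dict, so its keys are used directly, nested or not.
summaries = [summarize(item) for item in records]
for summary in summaries:
    print(summary)
```

Nested dicts are reached by chaining keys on the item, exactly as with the top-level ipAddress.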
Upvotes: 2
Reputation: 473753
You are basically iterating over a list of dicts; try just item['ipAddress'].
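Once each item is reachable as a plain dict, the CSV step mentioned in the question is mostly a matter of choosing columns. A minimal sketch, under several assumptions: the column selection, the to_csv helper and the placeholder IP addresses are all invented for illustration, and the code is written for Python 3 (on Python 2 the csv module expects a byte-oriented file object, so the io.StringIO buffer would need to be swapped out).

```python
import csv
import io

# Hypothetical records in the shape of the feed; the addresses are placeholders.
records = [
    {"ipAddress": "192.0.2.1", "confidence": 82, "threatType": "BN",
     "supportingData": {"behavior": "spamming"}},
    {"ipAddress": "192.0.2.2", "confidence": 40, "threatType": "BN",
     "supportingData": {"behavior": "spamming"}},
]


def to_csv(items):
    """Return the chosen fields of every item as CSV text, one row per item."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header row, then one row per dict in the list.
    writer.writerow(["ipAddress", "threatType", "confidence", "behavior"])
    for item in items:
        writer.writerow([item["ipAddress"],
                         item["threatType"],
                         item["confidence"],
                         item["supportingData"]["behavior"]])
    return buffer.getvalue()


csv_text = to_csv(records)
print(csv_text)
```

Using the csv module rather than joining strings by hand means any field that happens to contain a comma or a quote gets escaped for you.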
Upvotes: 4