Reputation: 162
I have a JSON object that consists of one object with the key 'data', whose value is a list of arrays. I need to return all arrays that contain the value x, but the arrays themselves do not have keys. I'm trying to write a script that takes a source file (inFile) and a destination file (outFile). Here is my data structure:
{ "data": [
["x", 1, 4, 6, 2, 7],
["y", 3, 2, 5, 8, 4],
["z", 5, 2, 5, 9, 9],
["x", 3, 7, 2, 6, 8]
]
}
And here is my current script:
import json

def jsonFilter( inFile, outFile ):
    out = None;

    with open( inFile, 'r') as jsonFile:
        d = json.loads(json_data)
        a = d['data']
        b = [b for b in a if b != 'x' ]
        del b
        out = a

    if out:
        with open( outFile, 'w' ) as jsonFile:
            jsonFile.write( json.dumps( out ) );
    else:
        print "Error creating new jsonFile!"
SOLUTION
Thanks to Rob and everyone for your help! Here's the final working command-line tool. It takes two arguments, inFile and outFile:
~$ python jsonFilter.py inFile.json outFile.json
import json

def jsonFilter( inFile, outFile ):
    # make a dictionary.
    out = {}

    with open( inFile, 'r') as jsonFile:
        json_data = jsonFile.read()
        d = json.loads(json_data)

    # build the data you want to save to look like the original
    # by taking the data in the d['data'] element and keeping only
    # the elements where b[0] is 'x'
    out['data'] = [b for b in d['data'] if b[0] == 'x' ]

    if out:
        with open( outFile, 'w' ) as jsonFile:
            jsonFile.write( json.dumps( out ) )
    else:
        print("Error creating new JSON file!")

if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument('inFile', nargs=1, help="Choose the in file to use")
    parser.add_argument('outFile', nargs=1, help="Choose the out file to use")
    args = parser.parse_args()
    # nargs=1 stores each argument as a one-element list
    jsonFilter( args.inFile[0] , args.outFile[0] )
Upvotes: 1
Views: 3652
Reputation: 2656
The first problem is that the filter condition is true for every element, so it returns the whole data set: you are comparing b (a list) to 'x' (a string).
b = [b for b in a if b != 'x' ]
What you wanted to do was:
b = [b for b in a if b[0] != 'x' ]
The second problem is that you are trying to delete the data by querying and then deleting the results. The list comprehension builds a new list, so deleting it does not remove anything from the original container.
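Both problems can be reproduced in a few lines. This is only an illustrative sketch: the sample rows and the variable names (`data`, `unfiltered`, `not_x`, `remaining`) are made up for the demonstration and are not part of the original script.

```python
# Sample rows shaped like the 'data' array in the question (illustrative only).
data = [["x", 1, 4], ["y", 3, 2], ["x", 3, 7]]

# A list is never equal to a string, so this condition is True for every row
# and the comprehension returns a copy of the whole data set.
unfiltered = [row for row in data if row != 'x']

# Comparing the first element of each row actually selects rows.
not_x = [row for row in data if row[0] != 'x']

# del only unbinds the name not_x; the original list is left untouched.
del not_x
remaining = len(data)
```

After this runs, `unfiltered` holds the same rows as `data`, and `data` still has all three rows even though `not_x` was deleted.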
Instead, build the new data with only the elements you want, and save those. Also, you were not recreating the 'data' element in your output, so the output JSON would not have the same structure as the input data.
import json

def jsonFilter( inFile, outFile ):
    # make a dictionary instead.
    out = {}

    with open( inFile, 'r') as jsonFile:
        json_data = jsonFile.read()
        d = json.loads(json_data)

    # build the data you want to save to look like the original
    # by taking the data in the d['data'] element and keeping only
    # the elements where b[0] is 'x'
    out['data'] = [b for b in d['data'] if b[0] == 'x' ]

    if out:
        with open( outFile, 'w' ) as jsonFile:
            jsonFile.write( json.dumps( out ) )
    else:
        print("Error creating new jsonFile!")
The output JSON data looks like:
'{"data": [["x", 1, 4, 6, 2, 7], ["x", 3, 7, 2, 6, 8]]}'
If you did not want the output to have the 'data' root element but just the array of data that matched your filter then change the line:
out['data'] = [b for b in d['data'] if b[0] == 'x' ]
to
out = [b for b in d['data'] if b[0] == 'x' ]
With this change, the output JSON data looks like:
'[["x", 1, 4, 6, 2, 7], ["x", 3, 7, 2, 6, 8]]'
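The filters above test only the first element of each array. The question asks for arrays that "contain the value x", so if the value may appear at any position, a membership test is one possible variant. The following is a minimal sketch under that assumption, written for Python 3; the names `filter_rows` and `json_filter` and the `value` parameter are hypothetical and not taken from the original script.

```python
import json

def filter_rows(doc, value):
    """Return a new dict keeping only the rows of doc['data'] that contain value."""
    # 'value in row' checks every position, not just row[0].
    return {'data': [row for row in doc['data'] if value in row]}

def json_filter(in_path, out_path, value='x'):
    """Read JSON from in_path, keep matching rows, and write the result to out_path."""
    with open(in_path, 'r') as f:
        doc = json.load(f)       # parse straight from the file object
    out = filter_rows(doc, value)
    with open(out_path, 'w') as f:
        json.dump(out, f)        # serialize straight to the file object
    return out
```

Keeping the filtering in its own function means it can be checked on an in-memory dictionary without touching any files.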
Upvotes: 2
Reputation: 9863
So, basically you want to keep only the arrays in your input data whose first element is 'x'. Maybe something like this will do:
import json

def jsonFilter(inFile, outFile):
    with open(inFile, 'r') as jsonFile:
        # parse the JSON directly from the open file
        d = json.load(jsonFile)

    out = {
        # list() keeps the result serializable on Python 3 as well
        'data': list(filter(lambda x: x[0] == 'x', d['data']))
    }

    if out['data']:
        with open(outFile, 'w') as jsonFile:
            jsonFile.write(json.dumps(out))
    else:
        print("Error creating new jsonFile!")
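One caveat about `filter()`: on Python 2 it returns a list, but on Python 3 it returns a lazy iterator, which `json.dumps` cannot serialize and which is always truthy in an `if` test. A minimal sketch of materializing the result with `list()`, using made-up sample rows and the hypothetical names `rows`, `kept`, and `encoded`:

```python
import json

# Sample rows shaped like the 'data' array in the question (illustrative only).
rows = [["x", 1, 4], ["y", 3, 2], ["x", 3, 7]]

# list() turns the filter result into a real list on Python 3;
# on Python 2 it simply copies the list that filter() already returned.
kept = list(filter(lambda row: row[0] == 'x', rows))

# A plain list is both JSON-serializable and falsy when empty.
encoded = json.dumps({'data': kept})
```

Here `encoded` is the string `{"data": [["x", 1, 4], ["x", 3, 7]]}`.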
Upvotes: 1