Reputation: 887
I've been working on this script today and have made good progress looping through the data and importing it into an external database. I'm trying to troubleshoot one field that is giving me trouble, and it doesn't make much sense. Whenever I run the script, I get the following error: KeyError: 'manufacturer'. If I comment out the line product_details['manufacturer'] = item['manufacturer'], the script runs as it should.
Not sure what else to check or where to go from here (new to Python).
I'm using the following test data
import json

input_file = open('data/bestbuy_seo.json')
json_array = json.load(input_file)
product_list = []

for item in json_array:
    product_details = {"name": None, "shortDescription": None, "bestSellingRank": None,
                       "thumbnailImage": None, "salePrice": None, "manufacturer": None, "url": None,
                       "type": None, "image": None, "customerReviewCount": None, "shipping": None,
                       "salePrice_range": None, "objectID": None, "categories": [None]}
    product_details['name'] = item['name']
    product_details['shortDescription'] = item['shortDescription']
    product_details['bestSellingRank'] = item['bestSellingRank']
    product_details['thumbnailImage'] = item['thumbnailImage']
    product_details['salePrice'] = item['salePrice']
    product_details['manufacturer'] = item['manufacturer']
    product_details['url'] = item['url']
    product_details['type'] = item['type']
    product_details['image'] = item['image']
    product_details['customerReviewCount'] = item['customerReviewCount']
    product_details['shipping'] = item['shipping']
    product_details['salePrice_range'] = item['salePrice_range']
    product_details['objectID'] = item['objectID']
    product_details['categories'] = item['categories']
    product_list.append(product_details)

# Let's dump it to the screen to see if it works
print(json.dumps(product_list, indent=4))
Upvotes: 0
Views: 1453
Reputation: 2829
Not quite the issue at hand (item is missing the key manufacturer, and perhaps others), but since you're just copying fields under the exact same keys, you can write something like the code below. Also note that item.get(key, None) will rid you of this error at the cost of having None values in the product (so if you prefer your code to fail hard when data is missing, this may be bad).
import json

input_file = open('data/bestbuy_seo.json')
json_array = json.load(input_file)
product_list = []

# Keys to copy from each item; a missing key becomes None
product_keys = ('objectID', 'image', 'thumbnailImage',
                'shortDescription', 'categories', 'manufacturer',
                'customerReviewCount', 'name', 'url', 'shipping',
                'salePrice', 'bestSellingRank', 'type',
                'salePrice_range')

for item in json_array:
    product_list.append(dict((key, item.get(key, None)) for key in product_keys))

# Let's dump it to the screen to see if it works
print(json.dumps(product_list, indent=4))
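To make the trade-off concrete, here is a minimal sketch that uses two made-up items instead of the real JSON file. The sample data and the helper name copy_fields are assumptions for illustration only; they are not part of the original script.

```python
# Hypothetical sample data standing in for data/bestbuy_seo.json
sample_items = [
    {"name": "Widget", "manufacturer": "Acme"},
    {"name": "Gadget"},  # no 'manufacturer' key
]

product_keys = ("name", "manufacturer")


def copy_fields(item, keys):
    """Copy the given keys from item, using None for any key that is absent."""
    return dict((key, item.get(key)) for key in keys)


# Lenient: every item is kept, missing fields become None
lenient = [copy_fields(item, product_keys) for item in sample_items]

# Strict: plain indexing stops at the first missing key instead
try:
    strict = [dict((key, item[key]) for key in product_keys)
              for item in sample_items]
except KeyError as err:
    strict = None
    missing_key = err.args[0]  # the key that was not found
```

With the lenient version the second product survives with manufacturer set to None; with the strict version nothing is produced until the data is fixed. Which behaviour you want depends on whether a missing field should count as an error in your import.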
Upvotes: 1
Reputation: 868
Here are two ways of getting around a dictionary not having a key. Both work, but the first one is probably easier to use and will work as a drop-in replacement for your current code.
The first way uses Python's dict.get() method. Here is a page with more examples of how it works. This approach was inspired by this answer by Ian A. Mason to the current question; I changed your code along the lines of his answer.
import json

input_file = open('data/bestbuy_seo.json')
json_array = json.load(input_file)
product_list = []

for item in json_array:
    # .get() returns None instead of raising KeyError when a key is absent
    product_details = {
        'name': item.get('name', None),
        'shortDescription': item.get('shortDescription', None),
        'bestSellingRank': item.get('bestSellingRank', None),
        'thumbnailImage': item.get('thumbnailImage', None),
        'salePrice': item.get('salePrice', None),
        'manufacturer': item.get('manufacturer', None),
        'url': item.get('url', None),
        'type': item.get('type', None),
        'image': item.get('image', None),
        'customerReviewCount': item.get('customerReviewCount', None),
        'shipping': item.get('shipping', None),
        'salePrice_range': item.get('salePrice_range', None),
        'objectID': item.get('objectID', None),
        'categories': item.get('categories', None)
    }
    product_list.append(product_details)

# Let's dump it to the screen to see if it works
print(json.dumps(product_list, indent=4))
The second way uses the 'ask for forgiveness, not permission' concept in Python: just let the one object that is missing a key fail, and keep going. When missing keys are rare, a try and except is usually faster than a bunch of if checks.
Here is a post about this concept.
import json
from copy import deepcopy

input_file = open('data/bestbuy_seo.json')
json_array = json.load(input_file)
product_list = []

product_details_master = {"name": None, "shortDescription": None, "bestSellingRank": None,
                          "thumbnailImage": None, "salePrice": None, "manufacturer": None, "url": None,
                          "type": None, "image": None, "customerReviewCount": None, "shipping": None,
                          "salePrice_range": None, "objectID": None, "categories": [None]}

for item in json_array:
    # Start each product from a fresh copy of the template
    product_details_temp = deepcopy(product_details_master)
    try:
        product_details_temp['name'] = item['name']
        product_details_temp['shortDescription'] = item['shortDescription']
        product_details_temp['bestSellingRank'] = item['bestSellingRank']
        product_details_temp['thumbnailImage'] = item['thumbnailImage']
        product_details_temp['salePrice'] = item['salePrice']
        product_details_temp['manufacturer'] = item['manufacturer']
        product_details_temp['url'] = item['url']
        product_details_temp['type'] = item['type']
        product_details_temp['image'] = item['image']
        product_details_temp['customerReviewCount'] = item['customerReviewCount']
        product_details_temp['shipping'] = item['shipping']
        product_details_temp['salePrice_range'] = item['salePrice_range']
        product_details_temp['objectID'] = item['objectID']
        product_details_temp['categories'] = item['categories']
        product_list.append(product_details_temp)
    except KeyError:
        # Add error handling here! Right now, if a product is missing any key,
        # that product is not added to product_list at all.
        print('There was a missing key in the json')

# Let's dump it to the screen to see if it works
print(json.dumps(product_list, indent=4))
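As one possible way to fill in the error handling mentioned in the comment, the sketch below records which item was skipped and which key it was missing. It runs on made-up sample data, and the names required_keys and skipped are assumptions for illustration, not part of the original script.

```python
# Hypothetical sample data standing in for data/bestbuy_seo.json
sample_items = [
    {"name": "Widget", "manufacturer": "Acme"},
    {"name": "Gadget"},  # no 'manufacturer' key
]
required_keys = ("name", "manufacturer")

product_list = []
skipped = []  # (item index, missing key) pairs for items that were dropped

for index, item in enumerate(sample_items):
    try:
        product_details = dict((key, item[key]) for key in required_keys)
    except KeyError as err:
        # err.args[0] holds the key that could not be found
        skipped.append((index, err.args[0]))
        continue
    product_list.append(product_details)

print("kept %d item(s), skipped: %s" % (len(product_list), skipped))
```

Keeping the skipped list around means you can decide afterwards whether to fix the source data or fall back to defaults, instead of only seeing a generic message.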
Upvotes: 0
Reputation: 219
My guess is that one of the items in your data does not have a 'manufacturer' key set.
Replace
item['manufacturer']
with
item.get('manufacturer', None)
or replace None with a default manufacturer.
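If you want to confirm the guess before changing anything, a quick check along these lines lists the items that lack the key. The sample data and the "Unknown" default are made up for illustration; substitute your loaded json_array and whatever default makes sense for you.

```python
# Hypothetical sample data standing in for the loaded json_array
sample_items = [
    {"name": "Widget", "manufacturer": "Acme"},
    {"name": "Gadget"},  # no 'manufacturer' key
]

# Positions of the items that do not have a 'manufacturer' key
missing = [index for index, item in enumerate(sample_items)
           if "manufacturer" not in item]

# Assumed placeholder value used when the key is absent
DEFAULT_MANUFACTURER = "Unknown"
manufacturers = [item.get("manufacturer", DEFAULT_MANUFACTURER)
                 for item in sample_items]

print("items missing 'manufacturer':", missing)
```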
Upvotes: 1