Reputation: 61
I am receiving this error:
Traceback (most recent call last):
  File "/Users/Rose/Documents/workspace/METProjectFOREAL/src/test_met4.py", line 79, in <module>
    table_list.append(table_template % art_temp_dict)
KeyError: 'artifact4'
from this code:
artifact_groups = grouper(4, html_list, "")
for artifact_group in artifact_groups:
    art_temp_dict={}
    for artifact in artifact_group:
        art_temp_dict["artifact"+str(artifact_group.index(artifact)+1)] = artifact
    table_list.append(table_template % art_temp_dict)
Here is a sample of the CSV:
"artifact4971.jpg","H. 17 1/2 x 16 1/2 x 5 1/2 in. (44.5 x 41.9 x 14 cm)","74.51.2648","4971"
"artifact4972.jpg","Overall: 5 1/2 x 3 3/4 x 4 in. (14.0 x 9.5 x 10.2 cm)","74.51.2592","4972"
"artifact4973.jpg","Overall: 6 5/8 x 7 1/4 x 1 1/4 in. (16.8 x 18.4 x 3.2 cm)","74.51.2594","4973"
"artifact4974.jpg","H. 5 1/2 x 6 3/4 x 11 3/4 in. (14 x 17.1 x 29.8 cm)","74.51.2628","4974"
"artifact4975.jpg","Overall: 10 1/8 7 7 in. (25.7 cm)","74.51.2633","4975"
"artifact4976.jpg","Overall: 7 1/2 5 11 1/2 in. (19.1 12.7 29.2 cm)","74.51.2637","4976"
"artifact4977.jpg","Overall: 10 1/2 7 8 1/2 in. (26.7 17.8 21.6 cm)","74.51.2819","4977"
"artifact4978.jpg","H. 6 3/8 x 14 1/2 x 5 1/4 in. (16.2 x 36.8 x 13.3 cm)","74.51.2831","4978"
I understand that the KeyError signifies that 'artifact4' does not exist, but I don't know why - I am taking data from a large CSV file with almost 6,000 records. Any suggestions greatly appreciated!
Upvotes: 2
Views: 2983
Reputation: 365697
You could make this a lot simpler by using csv.DictReader instead of using csv.reader and then trying to generate a dict out of each row:
>>> import csv
>>> s='''"artifact4971.jpg","H. 17 1/2 x 16 1/2 x 5 1/2 in. (44.5 x 41.9 x 14 cm)","74.51.2648","4971"
... "artifact4972.jpg","Overall: 5 1/2 x 3 3/4 x 4 in. (14.0 x 9.5 x 10.2 cm)","74.51.2592","4972"
... "artifact4973.jpg","Overall: 6 5/8 x 7 1/4 x 1 1/4 in. (16.8 x 18.4 x 3.2 cm)","74.51.2594","4973"'''
>>> reader = csv.DictReader(s.splitlines(),
...                         ('artifact1', 'artifact2', 'artifact3', 'artifact4'))
>>> list(reader)
[{'artifact1': 'artifact4971.jpg',
  'artifact2': 'H. 17 1/2 x 16 1/2 x 5 1/2 in. (44.5 x 41.9 x 14 cm)',
  'artifact3': '74.51.2648',
  'artifact4': '4971'},
 {'artifact1': 'artifact4972.jpg',
  'artifact2': 'Overall: 5 1/2 x 3 3/4 x 4 in. (14.0 x 9.5 x 10.2 cm)',
  'artifact3': '74.51.2592',
  'artifact4': '4972'},
 {'artifact1': 'artifact4973.jpg',
  'artifact2': 'Overall: 6 5/8 x 7 1/4 x 1 1/4 in. (16.8 x 18.4 x 3.2 cm)',
  'artifact3': '74.51.2594',
  'artifact4': '4973'}]
If you really want to build each row dict yourself, it's harder to get wrong if you use a dict comprehension. The declarative structure strongly encourages you to think about this properly. If you know about enumerate, you'll probably write something like this:
art_temp_dict={'artifact'+str(i+1): artifact
               for i, artifact in enumerate(artifact_group)}
… and if not, something like this, which is uglier but still correct:
art_temp_dict={'artifact'+str(i+1): artifact_group[i]
               for i in range(len(artifact_group))}
… rather than trying to recover the index by searching.
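For context, here is a minimal sketch of how the enumerate version could slot into the question's loop. It assumes Python 3 and that grouper is the itertools recipe with the (n, iterable, fillvalue) argument order used in the question. The html_list and table_template values are hypothetical stand-ins, not the asker's real data.

```python
from itertools import zip_longest


def grouper(n, iterable, fillvalue=None):
    """Collect items into groups of n, padding the last group with fillvalue."""
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


# Hypothetical stand-ins for the question's html_list and table_template.
html_list = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]
table_template = "%(artifact1)s|%(artifact2)s|%(artifact3)s|%(artifact4)s"

table_list = []
for artifact_group in grouper(4, html_list, ""):
    # The key comes from the position in the group, not from a value search.
    art_temp_dict = {"artifact" + str(i + 1): artifact
                     for i, artifact in enumerate(artifact_group)}
    table_list.append(table_template % art_temp_dict)

print(table_list)
```

Because every position gets its own key, the padded final group still formats without a KeyError.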
Upvotes: 3
Reputation: 308130
If you ever have a situation where the fourth column of the CSV has the same value as one of the earlier columns, index will return the position of the earlier match, so artifact4 will never be populated. Use this instead:
for i, artifact in enumerate(artifact_group):
    art_temp_dict["artifact"+str(i+1)] = artifact
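A small sketch of the difference, using a made-up group whose last value repeats the first (the data is hypothetical, chosen only to trigger the problem):

```python
# Hypothetical group: the fourth value duplicates the first.
artifact_group = ("a.jpg", "b.jpg", "c.jpg", "a.jpg")

# Searching by value: index() always returns the first matching position.
by_index = {}
for artifact in artifact_group:
    by_index["artifact" + str(artifact_group.index(artifact) + 1)] = artifact

# Counting positions: enumerate() yields each position exactly once.
by_enumerate = {}
for i, artifact in enumerate(artifact_group):
    by_enumerate["artifact" + str(i + 1)] = artifact

print(sorted(by_index))      # 'artifact4' is missing
print(sorted(by_enumerate))  # all four keys are present
```

If grouper pads a short final group with "" as in the question's call, those repeated padding values would presumably collide in the same way.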
Upvotes: 3