Waverly

Reputation: 61

Python dictionary KeyError?

I am receiving this error:

Traceback (most recent call last):
  File "/Users/Rose/Documents/workspace/METProjectFOREAL/src/test_met4.py", line 79, in <module>
    table_list.append(table_template % art_temp_dict)
KeyError: 'artifact4'

from this code:

artifact_groups = grouper(4, html_list, "")

for artifact_group in artifact_groups:
    art_temp_dict = {}
    for artifact in artifact_group:
        art_temp_dict["artifact"+str(artifact_group.index(artifact)+1)] = artifact

    table_list.append(table_template % art_temp_dict)

Here is a sample of the CSV:

"artifact4971.jpg","H. 17 1/2 x 16 1/2 x 5 1/2 in. (44.5 x 41.9 x 14 cm)","74.51.2648","4971"
"artifact4972.jpg","Overall: 5 1/2 x 3 3/4 x 4 in. (14.0 x 9.5 x 10.2 cm)","74.51.2592","4972"
"artifact4973.jpg","Overall: 6 5/8 x 7 1/4 x 1 1/4 in. (16.8 x 18.4 x 3.2 cm)","74.51.2594","4973"
"artifact4974.jpg","H. 5 1/2 x 6 3/4 x 11 3/4 in. (14 x 17.1 x 29.8 cm)","74.51.2628","4974"
"artifact4975.jpg","Overall: 10 1/8 7 7 in. (25.7 cm)","74.51.2633","4975"
"artifact4976.jpg","Overall: 7 1/2 5 11 1/2 in. (19.1 12.7 29.2 cm)","74.51.2637","4976"
"artifact4977.jpg","Overall: 10 1/2 7 8 1/2 in. (26.7 17.8 21.6 cm)","74.51.2819","4977"
"artifact4978.jpg","H. 6 3/8 x 14 1/2 x 5 1/4 in. (16.2 x 36.8 x 13.3 cm)","74.51.2831","4978"

I understand that the KeyError signifies that 'artifact4' does not exist, but I don't know why - I am taking data from a large CSV file with almost 6,000 records. Any suggestions greatly appreciated!

Upvotes: 2

Views: 2983

Answers (2)

abarnert

Reputation: 365697

You could make this a lot simpler by using csv.DictReader instead of using csv.reader and then trying to generate a dict out of each row:

>>> import csv
>>> s='''"artifact4971.jpg","H. 17 1/2 x 16 1/2 x 5 1/2 in. (44.5 x 41.9 x 14 cm)","74.51.2648","4971"
... "artifact4972.jpg","Overall: 5 1/2 x 3 3/4 x 4 in. (14.0 x 9.5 x 10.2 cm)","74.51.2592","4972"
... "artifact4973.jpg","Overall: 6 5/8 x 7 1/4 x 1 1/4 in. (16.8 x 18.4 x 3.2 cm)","74.51.2594","4973"'''
>>> reader = csv.DictReader(s.splitlines(), 
...                         ('artifact1', 'artifact2', 'artifact3', 'artifact4'))
>>> list(reader)
[{'artifact1': 'artifact4971.jpg',
  'artifact2': 'H. 17 1/2 x 16 1/2 x 5 1/2 in. (44.5 x 41.9 x 14 cm)',
  'artifact3': '74.51.2648',
  'artifact4': '4971'},
 {'artifact1': 'artifact4972.jpg',
  'artifact2': 'Overall: 5 1/2 x 3 3/4 x 4 in. (14.0 x 9.5 x 10.2 cm)',
  'artifact3': '74.51.2592',
  'artifact4': '4972'},
 {'artifact1': 'artifact4973.jpg',
  'artifact2': 'Overall: 6 5/8 x 7 1/4 x 1 1/4 in. (16.8 x 18.4 x 3.2 cm)',
  'artifact3': '74.51.2594',
  'artifact4': '4973'}]
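
As a rough sketch of how this could feed the template step from the question, the script below reads two of the sample rows with csv.DictReader and formats each row dict directly. The table_template string here is made up for illustration (the real one is not shown in the question), and io.StringIO stands in for the actual CSV file.

```python
import csv
import io

# Two rows copied from the sample in the question.
sample = (
    '"artifact4971.jpg","H. 17 1/2 x 16 1/2 x 5 1/2 in. (44.5 x 41.9 x 14 cm)","74.51.2648","4971"\n'
    '"artifact4972.jpg","Overall: 5 1/2 x 3 3/4 x 4 in. (14.0 x 9.5 x 10.2 cm)","74.51.2592","4972"\n'
)

# Hypothetical template; the question does not show the real table_template.
table_template = (
    "<tr><td>%(artifact1)s</td><td>%(artifact2)s</td>"
    "<td>%(artifact3)s</td><td>%(artifact4)s</td></tr>"
)

# Passing fieldnames explicitly means the first line is treated as data, not a header.
fieldnames = ("artifact1", "artifact2", "artifact3", "artifact4")
reader = csv.DictReader(io.StringIO(sample), fieldnames=fieldnames)

# Each row is already a mapping with all four keys, so %-formatting works directly.
table_list = [table_template % row for row in reader]
```

Because the keys come from the fieldnames tuple rather than from the data, every row gets all four keys regardless of what values it contains.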

If you really want to build each row dict yourself, it's harder to get wrong if you use a dict comprehension.

The declarative structure strongly encourages you to think about this properly. If you know about enumerate you'll probably write something like this:

 art_temp_dict={'artifact'+str(i+1): artifact
                for i, artifact in enumerate(artifact_group)}

… and if not, something like this—uglier, but still correct:

 art_temp_dict={'artifact'+str(i+1): artifact_group[i]
                for i in range(len(artifact_group))}

… rather than trying to recover the index by searching.
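
Here is a small end-to-end sketch of the comprehension applied to grouped data. It assumes that grouper is the well-known itertools recipe with the older (n, iterable, fillvalue) argument order, which matches the call in the question but is not confirmed there; html_list is made-up data.

```python
from itertools import zip_longest


def grouper(n, iterable, fillvalue=None):
    """Collect items into tuples of length n, padding the last one (assumed recipe)."""
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


# Made-up data: five items, so the second group is padded with "".
html_list = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]

# enumerate supplies the position, so the key never depends on the value.
dicts = [
    {"artifact%d" % i: artifact for i, artifact in enumerate(group, start=1)}
    for group in grouper(4, html_list, "")
]
```

Even the padded last group ends up with all four keys, because the key is derived from the position rather than from a search for the value.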

Upvotes: 3

Mark Ransom

Reputation: 308130

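If a group ever contains the same value more than once (for example, the fourth column of the CSV has the same value as one of the earlier columns), artifact_group.index(artifact) returns the position of the first match. The duplicate then overwrites the earlier key and artifact4 is never populated. Use enumerate instead: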

 for i, artifact in enumerate(artifact_group):
     art_temp_dict["artifact"+str(i+1)] = artifact
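
To see the difference, here is a minimal reproduction with invented values where the fourth item repeats the second. The group contents and the template string are hypothetical and only serve to trigger the same KeyError as in the question.

```python
# Hypothetical group: the fourth value duplicates the second.
group = ("x.jpg", "4971", "74.51.2648", "4971")

# Original approach: index() finds the first "4971" both times (position 1).
by_index = {}
for artifact in group:
    by_index["artifact" + str(group.index(artifact) + 1)] = artifact

# Fixed approach: enumerate() yields the true position of each item.
by_position = {}
for i, artifact in enumerate(group):
    by_position["artifact" + str(i + 1)] = artifact

# Hypothetical template that needs the artifact4 key.
template = "%(artifact1)s|%(artifact4)s"

try:
    template % by_index
    error = None
except KeyError as exc:
    error = exc.args[0]  # name of the missing key

fixed = template % by_position
```

With the index-based loop the dict only has artifact1, artifact2 and artifact3, so formatting fails on artifact4; the enumerate version has all four keys.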

Upvotes: 3
