Reputation: 53
I am trying to sort my crime totals by zip code and my victim counts by offense type. I have built a dictionary keyed by report number. Here is the output for a small sample of the data when I print the dictionary:
{'100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'], '20003319': ['64130', '13', 'VIC'], '60077156': ['64130', '18', 'VIC'], '100057708': ['99999', '17', 'VIC', 'VIC'], '40024161': ['64108', '17', 'VIC', 'VIC']}
The dictionary is built as follows: {report number: [zip code, offense type, one 'VIC' entry per victim]}
I'm brand new to coding and am just learning dictionaries. How would I go about sorting through the dictionary to organize my data into this format?
Zip Codes    Crime totals
=========================
64126        809
64127        3983
64128        1749
64129        1037
64130        4718
64131        2080
64132        2060
64133        2005
64134        2928
Any help would be much appreciated. Below is my code so far. I'm accessing two files with about 50,000 rows of data, so efficiency is very important.
from collections import Counter
incidents_f = open('incidents.csv', mode="r")
crime_dict = dict()
for line in incidents_f:
    line_1st = line.strip().split(",")
    # skip the header row
    if line_1st[0].upper() != "REPORT_NO":
        report_no = line_1st[0]
        offense = line_1st[3]
        zip_code = line_1st[4]
        # treat short or missing zip codes as unknown
        if len(zip_code) < 5:
            zip_code = "99999"
        if report_no in crime_dict:
            # list.append() returns None, so the calls cannot be chained
            crime_dict[report_no].extend([zip_code, offense])
        else:
            crime_dict[report_no] = [zip_code, offense]
# close file (the parentheses are required to actually call close)
incidents_f.close()
details_f = open('details.csv', mode='r')
for line in details_f:
    line_1st = line.strip().split(",")
    # skip the header row
    if line_1st[0].upper() != "REPORT_NO":
        report_no = line_1st[0]
        involvement = line_1st[1]
        # add one 'VIC' entry per victim row, only for reports already seen
        if involvement.upper() == 'VIC':
            victims = "VIC"
            if report_no in crime_dict:
                crime_dict[report_no].append(victims)
            else:
                continue
# close file (the parentheses are required to actually call close)
details_f.close()
print(crime_dict)
Upvotes: 0
Views: 125
Reputation: 1033
This is a way to do it with more code than @Alexander's solution:
crime_dict = {
    '100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'],
    '20003319': ['64130', '13', 'VIC'],
    '60077156': ['64130', '18', 'VIC'],
    '100057708': ['99999', '17', 'VIC', 'VIC'],
    '40024161': ['64108', '17', 'VIC', 'VIC']
}
crimes_by_zip = {}
# count one crime per report, keyed by zip code
for report_no, record in crime_dict.items():
    zip_code = record[0]  # named zip_code to avoid shadowing the built-in zip()
    if zip_code not in crimes_by_zip:
        crimes_by_zip[zip_code] = 0
    crimes_by_zip[zip_code] += 1
# print the totals in ascending zip code order
for zip_code in sorted(crimes_by_zip):
    print(zip_code, crimes_by_zip[zip_code])
64108 1
64130 3
99999 1
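For comparison, the same tally could probably be written with collections.Counter, which the question already imports. This is only a sketch: it assumes every value has the shape [zip code, offense type, 'VIC', ...] with one 'VIC' entry per victim, and the names crimes_by_zip and victims_by_offense are made up for illustration.

```python
from collections import Counter

crime_dict = {
    '100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'],
    '20003319': ['64130', '13', 'VIC'],
    '60077156': ['64130', '18', 'VIC'],
    '100057708': ['99999', '17', 'VIC', 'VIC'],
    '40024161': ['64108', '17', 'VIC', 'VIC'],
}

crimes_by_zip = Counter()       # zip code -> number of reports
victims_by_offense = Counter()  # offense type -> number of victims

# Assumption: the first two items are zip code and offense type,
# and everything after them is one 'VIC' entry per victim.
for zip_code, offense, *victims in crime_dict.values():
    crimes_by_zip[zip_code] += 1
    victims_by_offense[offense] += len(victims)

print("Zip Codes  Crime totals")
print("=" * 23)
for zip_code in sorted(crimes_by_zip):
    print(f"{zip_code:<10} {crimes_by_zip[zip_code]}")
```

Both counters are filled in a single pass over the dictionary, so the cost grows linearly with the number of reports, which should be fine for roughly 50,000 rows.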
Upvotes: 1
Reputation: 76
D = {
    '100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'],
    '20003319': ['64130', '13', 'VIC'],
    '60077156': ['64130', '18', 'VIC'],
    '100057708': ['99999', '17', 'VIC', 'VIC'],
    '40024161': ['64108', '17', 'VIC', 'VIC']
}
# pair each report number with its zip code, ordered by zip code
data_with_zip_duplicate = [(D[key][0], key) for key in sorted(D, key=lambda x: D[x][0])]
print(*data_with_zip_duplicate, sep="\n")
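The sorted (zip code, report number) pairs still repeat each zip code. One way to collapse them into per-zip groups might be itertools.groupby, sketched below. It only merges adjacent items with equal keys, which is why the report numbers are sorted by zip code first. The names zip_of, reports_sorted and reports_by_zip are hypothetical.

```python
from itertools import groupby

D = {
    '100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'],
    '20003319': ['64130', '13', 'VIC'],
    '60077156': ['64130', '18', 'VIC'],
    '100057708': ['99999', '17', 'VIC', 'VIC'],
    '40024161': ['64108', '17', 'VIC', 'VIC'],
}


def zip_of(report_no):
    """Return the zip code stored as the first item of a report's record."""
    return D[report_no][0]


# groupby only merges neighbouring items, so sort by the same key first.
reports_sorted = sorted(D, key=zip_of)

# zip code -> list of report numbers filed under that zip code
reports_by_zip = {
    zip_code: list(group)
    for zip_code, group in groupby(reports_sorted, key=zip_of)
}

for zip_code, reports in reports_by_zip.items():
    print(zip_code, len(reports), reports)
```

The length of each list is the crime total for that zip code, and the report numbers stay available if the individual records are needed later.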
Upvotes: 0