Reputation: 35
I am currently trying to create a nested dictionary from a CSV file.
The CSV file records how many people of each demographic live in each region. In the nested dictionary, each key is a region and each value is another dictionary. The inner dictionary uses the demographic as its key and the number of people as its value.
Region,American,Asian,Black
midwest,2500,2300,2150
north,1200,2300,2300
south,1211,211,2100
I currently have:
def load_csv(filename):
    data = {}
    with open(filename) as csvfile:
        fh = csv.DictReader(csvfile)
        for row in fh:
            key = row.pop('Region')
            data[key] = row
    return data
Expected output (the numbers must be converted from strings to integers):
{'midwest': {'American': 2500, 'Asian': 2300, ...}, 'north': {'American': 1200, ...}, ...}
I'm stuck because running my code raises "KeyError: 'Region'".
Upvotes: 1
Views: 103
Reputation: 4710
This solution requires no imports at all, but it only works if there are no escaped separators [i.e., none of the values contain a comma or a newline]:
def load_csv(filename, sep=','):
    with open(filename, 'r') as csvfile:
        csvlines = csvfile.read().strip().splitlines()
    csvRows = [[v.strip() for v in l.split(sep)] for l in csvlines]
    if not csvRows:
        return {}
    keys = csvRows[0][1:]
    return {r[0]: dict(zip(keys, r[1:])) for r in csvRows[1:] if r}
At the very least, it works for the CSV in your snippet, but if your CSV contains commas or newlines at unexpected positions, this function will no longer be reliable. It would definitely be better to use a module built for reading and parsing CSV.
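To illustrate the caveat about escaped separators, here is a minimal sketch comparing a plain split with the standard library's csv.reader on a quoted field. The sample text and the helper name load_csv_text are made up for this illustration, and the int() conversion assumes every column after the first holds whole numbers.

```python
import csv
import io

def load_csv_text(text):
    """Build {first column: {header: int(value)}} from CSV text (hypothetical helper)."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header is None:
        return {}
    keys = header[1:]
    # Skip blank lines; assumes every remaining cell is an integer literal.
    return {row[0]: {k: int(v) for k, v in zip(keys, row[1:])}
            for row in reader if row}

# Made-up sample: the region name itself contains a comma, so it is quoted.
sample = 'Region,American,Asian\n"north, upper",1200,2300\n'

# A plain split cuts the quoted region name into two fields...
naive_fields = sample.splitlines()[1].split(',')
# ...while csv.reader keeps it as a single field.
parsed = load_csv_text(sample)

print(len(naive_fields))
print(parsed)
```

With the split approach the data row ends up with one field more than the header, so the values would be paired with the wrong keys.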
Upvotes: 0
Reputation: 120409
Use a comprehension to convert string values to integers:
import csv

def load_csv(filename):
    data = {}
    with open(filename) as csvfile:
        # Your file has 3 invisible characters at the beginning, skip them
        csvfile.seek(3)
        fh = csv.DictReader(csvfile)
        for row in fh:
            key = row.pop('Region')
            data[key] = {k: int(v) for k, v in row.items()}  # <- HERE
    return data

data = load_csv('data.csv')
Output:
>>> data
{'midwest': {'American': 2500, 'Asian': 2300, 'Black': 2150},
 'north': {'American': 1200, 'Asian': 2300, 'Black': 2300},
 'south': {'American': 1211, 'Asian': 211, 'Black': 2100}}
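The three invisible characters are most likely a UTF-8 byte order mark (BOM). That would explain the KeyError: the first header is read as '\ufeffRegion' rather than 'Region'. As a possible alternative to seek(3), the sketch below assumes the file is UTF-8 encoded and opens it with encoding='utf-8-sig', which strips a BOM when one is present and changes nothing otherwise. The file name bom_demo.csv and the temporary directory exist only for this demonstration.

```python
import csv
import os
import tempfile

def load_csv(filename):
    data = {}
    # 'utf-8-sig' drops a leading BOM if present (assumes a UTF-8 file)
    with open(filename, newline='', encoding='utf-8-sig') as csvfile:
        for row in csv.DictReader(csvfile):
            key = row.pop('Region')
            data[key] = {k: int(v) for k, v in row.items()}
    return data

# Demo: write a small file that starts with a BOM, then load it.
path = os.path.join(tempfile.mkdtemp(), 'bom_demo.csv')
with open(path, 'w', newline='', encoding='utf-8-sig') as f:
    f.write('Region,American,Asian,Black\nmidwest,2500,2300,2150\n')

# Read as plain UTF-8, the BOM sticks to the first header name.
with open(path, newline='', encoding='utf-8') as f:
    raw_fields = csv.DictReader(f).fieldnames

result = load_csv(path)
print(raw_fields)
print(result)
```

Printing the reader's fieldnames, as done for raw_fields here, is also a quick way to check whether a BOM is really what causes the KeyError in a given file.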
Bonus: The same operation with Pandas:
import pandas as pd
data = pd.read_csv('data.csv', index_col='Region').T.to_dict()
print(data)
# Output
{'midwest': {'American': 2500, 'Asian': 2300, 'Black': 2150},
 'north': {'American': 1200, 'Asian': 2300, 'Black': 2300},
 'south': {'American': 1211, 'Asian': 211, 'Black': 2100}}
Upvotes: 1