Ak2012

Reputation: 35

Python: Trying to create nested dictionary from CSV File, but getting KeyError

I am currently trying to create a nested dictionary from a CSV file.

The CSV file gives the number of people in each demographic group for each region. In the nested dictionary, each key is a region and each value is another dictionary. The inner dictionary uses the demographic as the key and the number of people as the value.

Region,American,Asian,Black
midwest,2500,2300,2150
north,1200,2300,2300
south,1211,211,2100

I currently have:

import csv

def load_csv(filename):
    data = {}
    with open(filename) as csvfile:
        fh = csv.DictReader(csvfile)
        for row in fh:
            key = row.pop('Region')
            data[key] = row
        return data

Expected Output (must convert the numbers from strings to integers):

{'midwest': {'American': 2500, 'Asian': 2300, ...}, 'north': {'American': 1200, ...}, ...}

When I run my code, it fails with "KeyError: 'Region'".

Upvotes: 1

Views: 103

Answers (2)

Driftr95

Reputation: 4710

This solution requires no imports at all, but it only works if there are no escaped separators [i.e., none of the values contains a comma or a newline]:

def load_csv(filename, sep=','):
    with open(filename, 'r') as csvfile:
        csvlines = csvfile.read().strip().splitlines()
    # Split each non-blank line on the separator and trim whitespace around values
    csvRows = [[v.strip() for v in l.split(sep)] for l in csvlines if l.strip()]
    if not csvRows:
        return {}
    keys = csvRows[0][1:]  # header row without the leading 'Region' column
    return {r[0]: dict(zip(keys, r[1:])) for r in csvRows[1:]}

At the very least, it works for the CSV in your snippet, but if your CSV contains commas or newlines at unexpected positions (inside quoted values, for example), this function is no longer reliable. It would definitely be better to use a module built for reading and parsing CSV.
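
The function above keeps the counts as strings, while the question asks for integers. A minimal sketch of the same import-free approach with the conversion added might look like the following. The name load_csv_as_int is made up for illustration, and the same assumption applies: no value contains the separator or a newline.

```python
def load_csv_as_int(filename, sep=','):
    # Hypothetical variant of load_csv: same parsing, plus int() on each count.
    with open(filename, 'r') as csvfile:
        csvlines = csvfile.read().strip().splitlines()
    # Skip blank lines, split on the separator, trim whitespace around values
    rows = [[v.strip() for v in line.split(sep)] for line in csvlines if line.strip()]
    if not rows:
        return {}
    keys = rows[0][1:]  # header row minus the first (region) column
    # int() raises ValueError if a cell is not a whole number
    return {r[0]: {k: int(v) for k, v in zip(keys, r[1:])} for r in rows[1:]}
```

Because the first header cell is discarded rather than looked up by name, this version does not depend on that cell being exactly 'Region'.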

Upvotes: 0

Corralien

Reputation: 120409

Use a comprehension to convert string values to integers:

import csv

def load_csv(filename):
    data = {}
    with open(filename) as csvfile:
        # Your file has 3 invisible characters at the beginning, skip them
        csvfile.seek(3)
        fh = csv.DictReader(csvfile)
        for row in fh:
            key = row.pop('Region')
            data[key] = {k: int(v) for k, v in row.items()}  # <- HERE
        return data

data = load_csv('data.csv')

Output:

>>> data
{'midwest': {'American': 2500, 'Asian': 2300, 'Black': 2150},
 'north': {'American': 1200, 'Asian': 2300, 'Black': 2300},
 'south': {'American': 1211, 'Asian': 211, 'Black': 2100}}
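
The three invisible characters are most likely a UTF-8 byte order mark (BOM), which some editors and spreadsheet exports prepend to a file. That is an assumption based on the count of three; the question does not confirm it. If it holds, the first header is read as '\ufeffRegion' rather than 'Region', which would explain the KeyError. A sketch that lets Python strip the BOM through the utf-8-sig codec, instead of seeking past a fixed offset, could look like this:

```python
import csv

def load_csv(filename):
    data = {}
    # 'utf-8-sig' decodes UTF-8 and drops a leading BOM if one is present
    # (assumption: the file is UTF-8 encoded).
    # newline='' is what the csv module documentation recommends for reader input.
    with open(filename, encoding='utf-8-sig', newline='') as csvfile:
        for row in csv.DictReader(csvfile):
            key = row.pop('Region')
            data[key] = {k: int(v) for k, v in row.items()}
    return data
```

Unlike seek(3), this also works when the file has no BOM, because the codec only removes the mark when it is actually there.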

Bonus: The same operation with Pandas:

import pandas as pd

data = pd.read_csv('data.csv', index_col='Region').T.to_dict()
print(data)

# Output
{'midwest': {'American': 2500, 'Asian': 2300, 'Black': 2150},
 'north': {'American': 1200, 'Asian': 2300, 'Black': 2300},
 'south': {'American': 1211, 'Asian': 211, 'Black': 2100}}

Upvotes: 1
