rambalachandran
rambalachandran

Reputation: 2201

python read csv file with row and column headers into dictionary with two keys

I have csv file of the following format,

,col1,col2,col3
row1,23,42,77
row2,25,39,87
row3,48,67,53
row4,14,48,66

I need to read this into a dictionary of two keys such that

dict1['row1']['col2'] = 42
dict1['row4']['col3'] = 66

If I try to use csv.DictReader with default options

with open(filePath, "rb" ) as theFile:
    reader = csv.DictReader(theFile, delimiter=',')
    for line in reader:
    print line

I get the following output

{'': 'row1', 'col2': '42', 'col3': '77', 'col1': '23'}
{'': 'row2', 'col2': '39', 'col3': '87', 'col1': '25'}
{'': 'row3', 'col2': '67', 'col3': '53', 'col1': '48'}
{'': 'row4', 'col2': '48', 'col3': '66', 'col1': '14'}

I'm not sure of how to process this output to create the type of dictionary that I'm interested in.

For sake of completeness, it would also help if you can address how to write back the dictionary into a csv file with the above format

Upvotes: 13

Views: 33933

Answers (3)

Tim Pietzcker
Tim Pietzcker

Reputation: 336148

Using the CSV module:

import csv
dict1 = {}

with open("test.csv", "rb") as infile:
    reader = csv.reader(infile)
    headers = next(reader)[1:]
    for row in reader:
        dict1[row[0]] = {key: int(value) for key, value in zip(headers, row[1:])}

Upvotes: 29

Romain
Romain

Reputation: 21888

You can use pandas for that even if it is a bit an overkill. The pro is that there is almost nothing to code to obtain the expected result.

# Reading the file
df = pd.read_csv('tmp.csv', index_col=0)

# Creating the dict
d = df.transpose().to_dict(orient='series')

print(d['row1']['col2'])
42

Upvotes: 7

alecxe
alecxe

Reputation: 473863

The format of the input file is not exactly convenient to parse with csv module. I'd parse headers separately, then parse the rest line by line, splitting by ,, stripping and making dictionaries along the way. The working code:

from pprint import pprint

d = {}
with open("myfile.csv") as f:
    headers = [header.strip() for header in next(f).split(",")[1:]]

    for line in f:
        values = [value.strip() for value in line.split(",")]
        d[values[0]] = dict(zip(headers, values[1:]))

pprint(d)

Prints:

{'row1': {'col1': '23', 'col2': '42', 'col3': '77'},
 'row2': {'col1': '25', 'col2': '39', 'col3': '87'},
 'row3': {'col1': '48', 'col2': '67', 'col3': '53'},
 'row4': {'col1': '14', 'col2': '48', 'col3': '66'}}

Upvotes: 3

Related Questions