drbunsen

Reputation: 10679

Creating a dictionary from a csv file?

I am trying to create a dictionary from a csv file. The first column of the csv file contains unique keys and the second column contains values. Each row of the csv file represents a unique key, value pair within the dictionary. I tried to use the csv.DictReader and csv.DictWriter classes, but I could only figure out how to generate a new dictionary for each row. I want one dictionary. Here is the code I am trying to use:

import csv

with open('coors.csv', mode='r') as infile:
    reader = csv.reader(infile)
    with open('coors_new.csv', mode='w') as outfile:
        writer = csv.writer(outfile)
        for rows in reader:
            k = rows[0]
            v = rows[1]
            mydict = {k:v for k, v in rows}
        print(mydict)

When I run the above code I get a ValueError: too many values to unpack (expected 2). How do I create one dictionary from a csv file? Thanks.

Upvotes: 240

Views: 778366

Answers (19)

Tony Shouse

Reputation: 136

A simple function that takes a filename and returns a list of dictionaries, one per row (OrderedDict objects on Python 3.6 and 3.7, plain dicts on 3.8 and later):

def csv_to_dict(filename):
    import csv
    with open(filename, mode='r') as infile:
        reader = csv.DictReader(infile)
        data = [row for row in reader]
    return data

Upvotes: 1

Rob Coelli

Reputation: 1

Probably not the most efficient method but this is my preferred method as it's pretty flexible because you can select which column of data will be the keys and which will be the values.

import pandas as pd

df = pd.read_csv('coors.csv')

mydict = dict(zip(df['col1'],df['col2']))

Replace 'col1' with the column you want as the keys and 'col2' with the column you want as the values.

Upvotes: 0

Ion Harin

Reputation: 29

Here is an approach for converting a CSV to a dict, assuming the file has a header row with columns named k and v:

import pandas

data = pandas.read_csv('coors.csv')

the_dictionary_name = {row.k: row.v for (index, row) in data.iterrows()}

Upvotes: 0

conmak

Reputation: 1460

Assuming you have a CSV of this structure:

"a","b"
1,2
3,4
5,6

And you want the output to be:

[{'a': '1', 'b': '2'}, {'a': '3', 'b': '4'}, {'a': '5', 'b': '6'}]

A zip function (not yet mentioned) is simple and quite helpful.

import csv

def read_csv(filename):
    with open(filename) as f:
        file_data = csv.reader(f)
        headers = next(file_data)  # the first row holds the column names
        return [dict(zip(headers, i)) for i in file_data]

If you prefer pandas, it can also do this quite nicely:

import pandas as pd
def read_csv(filename):
    return pd.read_csv(filename).to_dict('records')

Upvotes: 17

Laxmikant Ratnaparkhi

Reputation: 4995

Open the file with open and pass the file object to csv.DictReader.

import csv
input_file = csv.DictReader(open("coors.csv"))

You can then iterate over the rows by iterating over input_file; each row is a dictionary keyed by the header names.

for row in input_file:
    print(row)

Or, to access only the first row (Python 2):

dictobj = csv.DictReader(open('coors.csv')).next()

Update: in Python 3 the reader has no .next() method, so use the built-in next() instead:

reader = csv.DictReader(open('coors.csv'))
dictobj = next(reader) 
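
Since the question asks for a single dictionary rather than one per row, the rows from DictReader can be collapsed with a dict comprehension. This is only a sketch: the header names name and value are made up for the example, and io.StringIO stands in for the real file.

```python
import csv
import io

# Hypothetical two-column data with a header row; io.StringIO stands in
# for open('coors.csv') so the example runs without a file on disk.
sample = io.StringIO("name,value\nx,10\ny,20\n")

reader = csv.DictReader(sample)
# Each row is a dict keyed by the header names; pick one column for the
# keys and one for the values to build a single dictionary.
mydict = {row["name"]: row["value"] for row in reader}
print(mydict)
```

Note that the values stay strings; convert them yourself if you need numbers.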

Upvotes: 150

Canute S

Reputation: 394

If you have:

  1. Only 1 key and 1 value as key,value in your csv
  2. Do not want to import other packages
  3. Want to create a dict in one shot

Do this:

mydict = {y[0]: y[1] for y in [x.split(",") for x in open('file.csv').read().split('\n') if x]}

What does it do?

The inner list comprehension splits the file into lines and each line into fields; the trailing "if x" skips blank lines (usually the one at the end of the file). The dictionary comprehension then maps the first field of each line to the second.
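
Written out as a plain loop, the one-liner is roughly equivalent to the sketch below. The sample text is made up and stands in for the contents of file.csv. Keep in mind that a plain split(",") does not understand quoted fields that contain commas, so this only suits very simple files.

```python
# Hypothetical file contents; in the one-liner this text comes from
# open('file.csv').read().
text = "a,1\nb,2\n\n"

mydict = {}
for line in text.split("\n"):
    if not line:
        # Skip blank lines, such as the empty string after the final newline.
        continue
    fields = line.split(",")
    mydict[fields[0]] = fields[1]
print(mydict)
```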

Upvotes: 0

Nate

Reputation: 12819

I believe the syntax you were looking for is as follows:

import csv

with open('coors.csv', mode='r') as infile:
    reader = csv.reader(infile)
    with open('coors_new.csv', mode='w') as outfile:
        writer = csv.writer(outfile)
        mydict = {rows[0]:rows[1] for rows in reader}

Alternatively, for Python versions before 2.7, which lack dict comprehensions, you want:

mydict = dict((rows[0],rows[1]) for rows in reader)
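
For anyone who wants to try this without a file on disk, here is a small self-contained sketch. The sample data is made up, and io.StringIO stands in for coors.csv. It also shows where the ValueError in the question comes from: iterating over a single row yields its fields, and each field (a string) cannot be unpacked into exactly two names.

```python
import csv
import io

# Hypothetical contents standing in for coors.csv.
data = "alpha,1\nbeta,2\n"

# One dictionary for the whole file: each row contributes one key/value pair.
mydict = {row[0]: row[1] for row in csv.reader(io.StringIO(data))}
print(mydict)

# The question's comprehension iterates over the fields of ONE row, so
# Python tries to unpack the string 'alpha' into k and v and fails.
first_row = next(csv.reader(io.StringIO(data)))
try:
    {k: v for k, v in first_row}
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```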

Upvotes: 236

TheTechGuy

Reputation: 1608

With pandas it is much easier. For example, assume you have the following data as CSV in a file called test.txt or test.csv (a CSV is just a kind of text file):

a,b,c,d
1,2,3,4
5,6,7,8

Now, using pandas:

import pandas as pd
df = pd.read_csv("./test.txt")
df_to_dict = df.to_dict()

To get one dictionary per row instead, use:

df.to_dict(orient='records')

and that's it.

Upvotes: 4

fabda01

Reputation: 3753

For simple csv files, such as the following

id,col1,col2,col3
row1,r1c1,r1c2,r1c3
row2,r2c1,r2c2,r2c3
row3,r3c1,r3c2,r3c3
row4,r4c1,r4c2,r4c3

You can convert it to a Python dictionary using only built-ins

with open(csv_file) as f:
    csv_list = [[val.strip() for val in r.split(",")] for r in f.readlines()]

(_, *header), *data = csv_list
csv_dict = {}
for row in data:
    key, *values = row   
    csv_dict[key] = {key: value for key, value in zip(header, values)}

This should yield the following dictionary

{'row1': {'col1': 'r1c1', 'col2': 'r1c2', 'col3': 'r1c3'},
 'row2': {'col1': 'r2c1', 'col2': 'r2c2', 'col3': 'r2c3'},
 'row3': {'col1': 'r3c1', 'col2': 'r3c2', 'col3': 'r3c3'},
 'row4': {'col1': 'r4c1', 'col2': 'r4c2', 'col3': 'r4c3'}}

Note: Python dictionaries have unique keys, so if your csv file has duplicate ids you should append each row to a list.

for row in data:
    key, *values = row

    if key not in csv_dict:
        csv_dict[key] = []

    csv_dict[key].append({key: value for key, value in zip(header, values)})

Upvotes: 9

Alejandro Villegas

Reputation: 400

Many solutions have been posted and I'd like to contribute mine, which works for any number of columns in the CSV file. It creates a dictionary with one key per column, and the value for each key is a list with the elements in that column.

import csv

input_file = csv.DictReader(open(path_to_csv_file))
# Start with an empty list for every column named in the header
csv_dict = {elem: [] for elem in input_file.fieldnames}
for row in input_file:
    for key in csv_dict.keys():
        csv_dict[key].append(row[key])

Upvotes: 2

Paulo Henrique Zen

Reputation: 679

Try to use a defaultdict and DictReader.

import csv
from collections import defaultdict
my_dict = defaultdict(list)

with open('filename.csv', 'r') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for line in csv_reader:
        for key, value in line.items():
            my_dict[key].append(value)

Afterwards my_dict contains something like:

{'key1': [value_1, value_2, value_3], 'key2': [value_a, value_b, value_c], 'key3': [value_x, value_y, value_z]}
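
As a concrete illustration, here is a sketch that runs the same loop on a small in-memory sample. The column names and values are made up, and io.StringIO stands in for filename.csv.

```python
import csv
import io
from collections import defaultdict

# Hypothetical data: a header row followed by two data rows.
sample = io.StringIO("key1,key2\n1,a\n2,b\n")

my_dict = defaultdict(list)
for line in csv.DictReader(sample):
    # Append each cell to the list that belongs to its column.
    for key, value in line.items():
        my_dict[key].append(value)

print(dict(my_dict))
```

Every value comes back as a string, so numeric columns need an explicit conversion.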

Upvotes: 1

mudassirkhan19

Reputation: 682

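This isn't elegant, but it is a one-line solution using pandas. (The squeeze=True argument to read_csv was removed in pandas 2.0; calling .squeeze("columns") on the result is the replacement.)

import pandas as pd
pd.read_csv('coors.csv', header=None, index_col=0).squeeze("columns").to_dict()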

If you want to specify dtype for your index (it can't be specified in read_csv if you use the index_col argument because of a bug):

import pandas as pd
pd.read_csv('coors.csv', header=None, dtype={0: str}).set_index(0).squeeze().to_dict()

Upvotes: 51

Trideep Rath

Reputation: 3703

One-liner solution

import pandas as pd

# Named mydict to avoid shadowing the built-in dict; .iloc selects by position
mydict = {row.iloc[0]: row.iloc[1] for _, row in pd.read_csv("file.csv").iterrows()}

Upvotes: 11

hamed

Reputation: 1383

You can use this, it is pretty cool:

import dataconverters.commas as commas
filename = 'test.csv'
with open(filename) as f:
    records, metadata = commas.parse(f)
    for row in records:
        print('this is row in dictionary:', row)

Upvotes: 2

cloudyBlues

Reputation: 75

If you are OK with using the numpy package, then you can do something like the following:

import numpy as np

lines = np.genfromtxt("coors.csv", delimiter=",", dtype=None)
my_dict = dict()
for i in range(len(lines)):
   my_dict[lines[i][0]] = lines[i][1]

Upvotes: 3

Thiru

Reputation: 3363

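You can also use numpy for this. Note that loadtxt parses every field as a float by default, so this works as written only when both the keys and the values are numeric.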

from numpy import loadtxt
key_value = loadtxt("filename.csv", delimiter=",")
mydict = { k:v for k,v in key_value }

Upvotes: 14

Alex Laskin

Reputation: 1127

You just have to convert the csv.reader to a dict:

~ >> cat > 1.csv
key1, value1
key2, value2
key2, value22
key3, value3

~ >> cat > d.py
import csv
with open('1.csv') as f:
    d = dict(filter(None, csv.reader(f)))

print(d)

~ >> python d.py
{'key3': ' value3', 'key2': ' value22', 'key1': ' value1'}
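
Two things are visible in that output: when a key is repeated, the last row wins (key2 maps to value22), and the space after each comma ends up in the value. If the leading space is unwanted, csv.reader accepts skipinitialspace=True. A minimal sketch, with the same rows held in memory instead of 1.csv:

```python
import csv
import io

# Same sample rows as above, held in memory instead of being read from 1.csv.
data = "key1, value1\nkey2, value2\nkey2, value22\nkey3, value3\n"

# skipinitialspace=True drops whitespace that immediately follows the delimiter.
reader = csv.reader(io.StringIO(data), skipinitialspace=True)
d = dict(filter(None, reader))
print(d)
```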

Upvotes: 23

John La Rooy

Reputation: 304117

I'd suggest adding "if row" to the comprehension in case there is an empty line at the end of the file:

import csv
with open('coors.csv', mode='r') as infile:
    reader = csv.reader(infile)
    with open('coors_new.csv', mode='w') as outfile:
        writer = csv.writer(outfile)
        mydict = dict(row[:2] for row in reader if row)
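
The filter works because csv.reader returns an empty list for a blank line, and an empty list is falsy. A small sketch with made-up data, using io.StringIO in place of coors.csv:

```python
import csv
import io

# Hypothetical contents with a blank line in the middle and one at the end.
data = "a,1\n\nb,2\n\n"

rows = list(csv.reader(io.StringIO(data)))
# Blank lines come back as empty lists, which "if row" filters out;
# row[:2] keeps only the first two columns of each remaining row.
mydict = dict(row[:2] for row in rows if row)
print(rows)
print(mydict)
```

Without the filter, dict() would raise a ValueError on the empty rows because they do not contain a key/value pair.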

Upvotes: 5

robert

Reputation: 34398

import csv
reader = csv.reader(open('filename.csv', 'r'))
d = {}
for row in reader:
   k, v = row
   d[k] = v

Upvotes: 76
