Reputation: 10679
I am trying to create a dictionary from a csv file. The first column of the csv file contains unique keys and the second column contains values. Each row of the csv file represents a unique key, value pair within the dictionary. I tried to use the csv.DictReader and csv.DictWriter classes, but I could only figure out how to generate a new dictionary for each row. I want one dictionary. Here is the code I am trying to use:
import csv

with open('coors.csv', mode='r') as infile:
    reader = csv.reader(infile)
    with open('coors_new.csv', mode='w') as outfile:
        writer = csv.writer(outfile)
        for rows in reader:
            k = rows[0]
            v = rows[1]
            mydict = {k:v for k, v in rows}
    print(mydict)
When I run the above code I get a ValueError: too many values to unpack (expected 2). How do I create one dictionary from a csv file? Thanks.
Upvotes: 240
Views: 778366
Reputation: 136
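A simple function that takes a filename and returns a list with one dictionary per row (each row is an OrderedDict on Python 3.6 and 3.7, and a plain dict from 3.8 onwards):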
def csv_to_dict(filename):
    import csv
    with open(filename, mode='r') as infile:
        reader = csv.DictReader(infile)
        data = [row for row in reader]
    return data
Upvotes: 1
Reputation: 1
Probably not the most efficient method, but it is my preferred one because it is flexible: you can select which column supplies the keys and which supplies the values.
import pandas as pd
df = pd.read_csv('coors.csv')
mydict = dict(zip(df['col1'],df['col2']))
Replace 'col1' with the column you want to use for the keys and 'col2' with the column you want to use for the values.
Upvotes: 0
Reputation: 29
Here is an approach for converting a CSV to a dict, assuming the file has columns named k and v:
import pandas
data = pandas.read_csv('coors.csv')
the_dictionary_name = {row.k: row.v for (index, row) in data.iterrows()}
Upvotes: 0
Reputation: 1460
Assuming you have a CSV of this structure:
"a","b"
1,2
3,4
5,6
And you want the output to be:
[{'a': '1', 'b': '2'}, {'a': '3', 'b': '4'}, {'a': '5', 'b': '6'}]
A zip function (not yet mentioned) is simple and quite helpful.
import csv

def read_csv(filename):
    with open(filename) as f:
        file_data = csv.reader(f)
        headers = next(file_data)  # first row holds the column names
        return [dict(zip(headers, i)) for i in file_data]
If you prefer pandas, it can also do this quite nicely:
import pandas as pd
def read_csv(filename):
    return pd.read_csv(filename).to_dict('records')
Upvotes: 17
Reputation: 4995
Open the file by calling open and then using csv.DictReader.
input_file = csv.DictReader(open("coors.csv"))
You may iterate over the rows of the csv file dict reader object by iterating over input_file.
for row in input_file:
    print(row)
Or, to access the first row only (Python 2):
dictobj = csv.DictReader(open('coors.csv')).next()
Update: in Python 3, the .next() method no longer exists, so use the built-in next() instead:
reader = csv.DictReader(open('coors.csv'))
dictobj = next(reader)
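Since DictReader yields one dictionary per row, collapsing the rows into the single dictionary the question asks for still needs one more step. A minimal sketch, assuming the file has a header row; the column names "name" and "value" and the sample rows are made up for illustration, and io.StringIO stands in for the real file:

```python
import csv
import io

# Hypothetical sample with a header row; replace with open('coors.csv').
sample = io.StringIO("name,value\nalpha,1\nbeta,2\n")

reader = csv.DictReader(sample)

# Pick one column for the keys and one for the values.
mydict = {row["name"]: row["value"] for row in reader}
print(mydict)
```

If the file has no header row, DictReader would treat the first data row as the header, so plain csv.reader is probably the better fit in that case.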
Upvotes: 150
Reputation: 394
If you have:
Do this:
mydict = {y[0]: y[1] for y in [x.split(",") for x in open('file.csv').read().split('\n') if x]}
It uses a list comprehension to split the file into lines and each line into fields; the trailing "if x" skips blank lines (usually at the end of the file). A dictionary comprehension then turns the first two fields of each line into a key, value pair.
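One caveat with splitting on commas by hand: it breaks when a field contains a quoted comma. The sketch below compares the manual split with csv.reader on made-up sample text (the data is an assumption, not from the original file):

```python
import csv
import io

# Hypothetical data: the first value contains a quoted comma,
# and the text ends with a blank line.
text = 'city,"Denver, CO"\nstate,Colorado\n\n'

# Manual split: the quoted field is cut at its inner comma.
naive = {y[0]: y[1] for y in [x.split(",") for x in text.split("\n") if x]}

# csv.reader understands the quoting; "if row" skips the blank line.
parsed = {row[0]: row[1] for row in csv.reader(io.StringIO(text)) if row}

print(naive)
print(parsed)
```

For files that are guaranteed to have no quoting, the one-liner is fine; otherwise the csv module is the safer choice.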
Upvotes: 0
Reputation: 12819
I believe the syntax you were looking for is as follows:
import csv

with open('coors.csv', mode='r') as infile:
    reader = csv.reader(infile)
    with open('coors_new.csv', mode='w') as outfile:
        writer = csv.writer(outfile)
        mydict = {rows[0]:rows[1] for rows in reader}
Alternatively, for Python 2.6 and earlier, which lack dict comprehensions, you want:
mydict = dict((rows[0],rows[1]) for rows in reader)
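To see why the loop in the question raised the ValueError while the comprehension above works, here is a small self-contained sketch. It uses io.StringIO with made-up rows in place of coors.csv, so the data is only an assumption for illustration:

```python
import csv
import io

# Hypothetical sample data standing in for coors.csv.
sample = io.StringIO("alpha,1\nbeta,2\ngamma,3\n")
rows = list(csv.reader(sample))

# Iterating over a single row yields its cells (strings), so "k, v"
# tries to unpack the characters of "alpha" into two names and fails.
try:
    {k: v for k, v in rows[0]}
    error = None
except ValueError as exc:
    error = str(exc)

# Iterating over all rows yields one [key, value] list per row instead.
mydict = {row[0]: row[1] for row in rows}

print(error)
print(mydict)
```

In other words, the comprehension has to run over the reader (all rows), not over one row inside a loop.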
Upvotes: 236
Reputation: 1608
With pandas it is much easier. Assume you have the following data as CSV in a file called test.txt or test.csv (a CSV file is just a text file):
a,b,c,d
1,2,3,4
5,6,7,8
Now, using pandas:
import pandas as pd
df = pd.read_csv("./test.txt")
df_to_dict = df.to_dict()
For one dictionary per row, use:
df.to_dict(orient='records')
And that's it.
Upvotes: 4
Reputation: 3753
For simple csv files, such as the following
id,col1,col2,col3
row1,r1c1,r1c2,r1c3
row2,r2c1,r2c2,r2c3
row3,r3c1,r3c2,r3c3
row4,r4c1,r4c2,r4c3
You can convert it to a Python dictionary using only built-ins
with open(csv_file) as f:
    csv_list = [[val.strip() for val in r.split(",")] for r in f.readlines()]

(_, *header), *data = csv_list
csv_dict = {}
for row in data:
    key, *values = row
    csv_dict[key] = {key: value for key, value in zip(header, values)}
This should yield the following dictionary
{'row1': {'col1': 'r1c1', 'col2': 'r1c2', 'col3': 'r1c3'},
'row2': {'col1': 'r2c1', 'col2': 'r2c2', 'col3': 'r2c3'},
'row3': {'col1': 'r3c1', 'col2': 'r3c2', 'col3': 'r3c3'},
'row4': {'col1': 'r4c1', 'col2': 'r4c2', 'col3': 'r4c3'}}
Note: Python dictionaries have unique keys, so if your csv file has duplicate ids you should append each row to a list.
for row in data:
    key, *values = row
    if key not in csv_dict:
        csv_dict[key] = []
    csv_dict[key].append({key: value for key, value in zip(header, values)})
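The same grouping idea can be sketched with the csv module and a defaultdict, which removes the need for the "if key not in csv_dict" check. This is only an illustration under assumptions: the sample rows are made up, the id column is assumed to be named "id", and io.StringIO stands in for the real file:

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in which the id "row1" appears twice.
sample = io.StringIO(
    "id,col1,col2\n"
    "row1,a,b\n"
    "row2,c,d\n"
    "row1,e,f\n"
)

grouped = defaultdict(list)
for record in csv.DictReader(sample):
    key = record.pop("id")       # remove the id column and use it as the key
    grouped[key].append(record)  # keep every row that shares this id

print(dict(grouped))
```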
Upvotes: 9
Reputation: 400
Many solutions have been posted and I'd like to contribute mine, which works for any number of columns in the CSV file. It creates a dictionary with one key per column, and the value for each key is a list of the elements in that column.
import csv

input_file = csv.DictReader(open(path_to_csv_file))
csv_dict = {elem: [] for elem in input_file.fieldnames}
for row in input_file:
    for key in csv_dict.keys():
        csv_dict[key].append(row[key])
Upvotes: 2
Reputation: 679
Try using a defaultdict and DictReader.
import csv
from collections import defaultdict
my_dict = defaultdict(list)
with open('filename.csv', 'r') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for line in csv_reader:
        for key, value in line.items():
            my_dict[key].append(value)
It returns:
{'key1':[value_1, value_2, value_3], 'key2': [value_a, value_b, value_c], 'Key3':[value_x, Value_y, Value_z]}
Upvotes: 1
Reputation: 682
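This isn't elegant, but it is a one-line solution using pandas. Note that the squeeze argument of read_csv was deprecated in pandas 1.4 and removed in 2.0, so this form only works on older versions: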
import pandas as pd
pd.read_csv('coors.csv', header=None, index_col=0, squeeze=True).to_dict()
If you want to specify dtype for your index (it can't be specified in read_csv if you use the index_col argument because of a bug):
import pandas as pd
pd.read_csv('coors.csv', header=None, dtype={0: str}).set_index(0).squeeze().to_dict()
Upvotes: 51
Reputation: 3703
One-liner solution
import pandas as pd
mydict = {row.iloc[0]: row.iloc[1] for _, row in pd.read_csv("file.csv").iterrows()}
Upvotes: 11
Reputation: 1383
You can use this, it is pretty cool:
import dataconverters.commas as commas
filename = 'test.csv'
with open(filename) as f:
    records, metadata = commas.parse(f)
    for row in records:
        print('this is row in dictionary: ' + str(row))
Upvotes: 2
Reputation: 75
If you are OK with using the numpy package, then you can do something like the following:
import numpy as np
lines = np.genfromtxt("coors.csv", delimiter=",", dtype=None)
my_dict = dict()
for i in range(len(lines)):
    my_dict[lines[i][0]] = lines[i][1]
Upvotes: 3
Reputation: 3363
You can also use numpy for this, provided both columns are numeric (loadtxt parses values as floats by default).
from numpy import loadtxt
key_value = loadtxt("filename.csv", delimiter=",")
mydict = { k:v for k,v in key_value }
Upvotes: 14
Reputation: 1127
You have to just convert csv.reader to dict:
~ >> cat > 1.csv
key1, value1
key2, value2
key2, value22
key3, value3
~ >> cat > d.py
import csv
with open('1.csv') as f:
    d = dict(filter(None, csv.reader(f)))
print(d)
~ >> python d.py
{'key3': ' value3', 'key2': ' value22', 'key1': ' value1'}
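Two details are visible in that output: the values keep the space that follows each comma, and the repeated key2 ends up with the last value seen. If the leading spaces are unwanted, the reader's skipinitialspace option may help. A small sketch, with io.StringIO standing in for 1.csv:

```python
import csv
import io

# Same rows as 1.csv above, held in memory for the example.
sample = io.StringIO("key1, value1\nkey2, value2\nkey2, value22\nkey3, value3\n")

# skipinitialspace=True drops whitespace right after each delimiter;
# filter(None, ...) still removes empty rows, as in the original.
d = dict(filter(None, csv.reader(sample, skipinitialspace=True)))
print(d)
```

The duplicate key still resolves to its last value, because later pairs overwrite earlier ones when dict() consumes them in order.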
Upvotes: 23
Reputation: 304117
I'd suggest adding if row in case there is an empty line at the end of the file:
import csv
with open('coors.csv', mode='r') as infile:
    reader = csv.reader(infile)
    with open('coors_new.csv', mode='w') as outfile:
        writer = csv.writer(outfile)
        mydict = dict(row[:2] for row in reader if row)
Upvotes: 5
Reputation: 34398
import csv
reader = csv.reader(open('filename.csv', 'r'))
d = {}
for row in reader:
    k, v = row
    d[k] = v
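Note that "k, v = row" requires every row to have exactly two columns, so a blank line or an extra column raises a ValueError. A slightly more defensive sketch of the same loop follows; the function name two_column_dict and the sample data are hypothetical, chosen only for illustration:

```python
import csv
import io


def two_column_dict(lines):
    """Build a dict from the first two columns of each non-empty CSV row."""
    result = {}
    for row_number, row in enumerate(csv.reader(lines), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) < 2:
            raise ValueError(
                f"row {row_number}: expected at least 2 columns, got {len(row)}"
            )
        result[row[0]] = row[1]  # any extra columns are ignored
    return result


# Hypothetical sample with an extra column and a blank line.
sample = io.StringIO("a,1,extra\nb,2\n\nc,3\n")
d = two_column_dict(sample)
print(d)
```

Passing an open file object instead of the StringIO works the same way, since csv.reader accepts any iterable of lines.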
Upvotes: 76