Reputation: 3173
I have a CSV file with the following records.
language,1,english1
language,3,english3
language,4,english4
language,5,english5
language,6,english6
language,7,english7
gender,F,F
gender,female,F
gender,Female,F
gender,M,M
gender,male,M
gender,Male,M
I would like to create one dictionary per distinct value in the first column, say dictlanguage and dictgender, each holding the key/value pairs taken from the second and third columns.
What I am looking for:
dictlanguage = [{'3': 'english3', '4': 'english4', '5': 'english5', '6': 'english6', '7': 'english7'}]
dictgender = [{'F': 'F', 'female': 'F', 'Female': 'F', 'M': 'M', 'male': 'M', 'Male': 'M'}]
This would let me pick the appropriate dictionary and look up its keys and values. The original dataset is huge, so I would like to have separate dictionaries. I have tried the following code, but I get one single dictionary. Can someone help me, please?
I am stuck on creating a dynamic variable name for each dictionary based on column 1, and on getting multiple dictionaries with clean, simple code.
import csv
reader = csv.reader(open('c:\\sample.csv', newline='', encoding='utf8'))
# result = {}
for row in reader:
    # print(row)
    d2 = [{rows[1]: rows[2] for rows in reader}]
print(d2)
This prints the following output:
[{'3': 'english3', '4': 'english4', '5': 'english5', '6': 'english6', '7': 'english7', 'F': 'F', 'female': 'F', 'Female': 'F', 'M': 'M', 'male': 'M', 'Male': 'M'}]
I would like to accomplish this without pandas if possible, due to project limitations. I would appreciate any help on this.
Upvotes: 0
Views: 174
Reputation: 51683
You can use the first column as the key of an outer dictionary whose value is an inner dictionary, then fill each inner dictionary with the key/value pairs from the other two columns.
Write demo file:
with open('fn.csv', 'w') as f:
    f.write("""language,1,english1
language,3,english3
language,4,english4
language,5,english5
language,6,english6
language,7,english7
gender,F,F
gender,female,F
gender,Female,F
gender,M,M
gender,male,M
gender,Male,M""")
It is easy to do it using a defaultdict like so:
import csv
from collections import defaultdict
dd = defaultdict(dict)
with open('fn.csv') as f:
    reader = csv.reader(f)
    for d, key, val in reader:
        dd[d][key] = val
print(*dd.values(), sep="\n\n")
Output:
{'1': 'english1', '3': 'english3', '4': 'english4', '5': 'english5',
'6': 'english6', '7': 'english7'}
{'F': 'F', 'female': 'F', 'Female': 'F', 'M': 'M', 'male': 'M', 'Male': 'M'}
The main advantage is that it is robust against adding more categories to your file: you do not need to know in advance how many inner dicts exist.
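To illustrate that point, here is a minimal sketch that runs the same loop over three kinds of rows. The `country` rows and the in-memory `io.StringIO` file are made up for this example; they are not part of the original data.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample: the "country" rows are invented for this sketch.
sample = io.StringIO(
    "language,1,english1\n"
    "gender,F,F\n"
    "country,US,United States\n"
    "country,DE,Germany\n"
)

dd = defaultdict(dict)
for d, key, val in csv.reader(sample):
    dd[d][key] = val  # the inner dict is created on first access

print(sorted(dd))     # names of all inner dicts found in the data
print(dd["country"])  # the new category needed no code change
```

The loop itself is unchanged; only the data grew.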
Without defaultdict, you can make dd a plain dict (dd = {}) and replace
dd[d][key] = val
with (a somewhat slower)
inner = dd.setdefault(d,{})
inner[key] = val
to create
{'language': {'1': 'english1', '3': 'english3', '4': 'english4',
'5': 'english5', '6': 'english6', '7': 'english7'},
'gender': {'F': 'F', 'female': 'F', 'Female': 'F', 'M': 'M',
'male': 'M', 'Male': 'M'}}
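Put together, the plain-dict variant might look like the sketch below. It assumes a shortened copy of the sample rows held in an `io.StringIO` object (a stand-in for fn.csv) so that it runs on its own.

```python
import csv
import io

# In-memory stand-in for fn.csv so the sketch is self-contained.
sample = io.StringIO(
    "language,1,english1\n"
    "language,3,english3\n"
    "gender,F,F\n"
    "gender,female,F\n"
)

dd = {}  # plain dict instead of defaultdict(dict)
for d, key, val in csv.reader(sample):
    # setdefault returns the existing inner dict, or inserts and returns {}
    inner = dd.setdefault(d, {})
    inner[key] = val

print(dd)
```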
Upvotes: 0
Reputation: 3379
As described, without using any library, one approach could be the following:
# d is your data as a 2-D list: one [name, key, value] row per CSV line
s = {x[0] for x in d}  # distinct names from the first column
res = {k: dict(x[1:] for x in d if x[0] == k) for k in s}
The resulting dictionary of dictionaries is keyed by the distinct names that appear as the first element of each row of the two-dimensional list, so you can obtain the individual dicts simply with:
language_dict = res["language"]
gender_dict = res["gender"]
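For completeness, here is a rough sketch of how `d` could be built from the CSV data before applying the comprehension. The `io.StringIO` sample stands in for the real file and is an assumption of this example. Note that the comprehension rescans `d` once per distinct name, which may matter for a very large file.

```python
import csv
import io

# In-memory stand-in for the CSV file (shortened sample data).
sample = io.StringIO(
    "language,1,english1\n"
    "language,3,english3\n"
    "gender,F,F\n"
    "gender,male,M\n"
)

d = list(csv.reader(sample))  # list of [name, key, value] rows

s = {x[0] for x in d}  # distinct first-column names
# For each name, build a dict from the [key, value] tail of its rows.
res = {k: dict(x[1:] for x in d if x[0] == k) for k in s}

language_dict = res["language"]
gender_dict = res["gender"]
print(language_dict)
print(gender_dict)
```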
Upvotes: 0
Reputation: 909
You could do it like this:
import csv
def split_data(reader: csv.reader) -> dict:
    dicts = {}
    for row in reader:
        # Prefix the first column to build names such as "dictlanguage"
        name = f"dict{row[0]}"
        if name in dicts:
            dicts[name][row[1]] = row[2]
        else:
            dicts[name] = {row[1]: row[2]}
    return dicts

# Use "with" so the file is closed once the data has been read
with open('data.csv', newline='', encoding='utf8') as f:
    data = split_data(csv.reader(f))
# output
{'dictlanguage': {'1': 'english1',
'3': 'english3',
'4': 'english4',
'5': 'english5',
'6': 'english6',
'7': 'english7'},
'dictgender': {'F': 'F',
'female': 'F',
'Female': 'F',
'M': 'M',
'male': 'M',
'Male': 'M'}}
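Once the data is split, each dictionary can be picked by name and queried. The sketch below repeats `split_data` so that it runs on its own and uses an in-memory sample instead of data.csv; the `"unknown"` key and the `"?"` fallback are invented for illustration.

```python
import csv
import io

def split_data(reader):
    dicts = {}
    for row in reader:
        name = f"dict{row[0]}"
        if name in dicts:
            dicts[name][row[1]] = row[2]
        else:
            dicts[name] = {row[1]: row[2]}
    return dicts

# In-memory stand-in for data.csv (shortened sample data).
sample = io.StringIO(
    "language,3,english3\n"
    "gender,female,F\n"
    "gender,Male,M\n"
)
data = split_data(csv.reader(sample))

dictgender = data["dictgender"]          # pick one dictionary by name
code = dictgender["female"]              # direct lookup of a known key
fallback = dictgender.get("unknown", "?")  # hypothetical key with a default
print(code, fallback)
```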
Upvotes: 1
Reputation: 27515
You could use if statements to determine which dictionary to edit. I would also suggest using the with statement so the file is closed when you are finished:
import csv
dict_language = {}
dict_gender = {}
with open('filename.csv') as f:
    reader = csv.reader(f)
    for d, key, val in reader:
        if d == 'language':
            dict_language[key] = val
        elif d == 'gender':
            dict_gender[key] = val
print(dict_language)
print(dict_gender)
{'1': 'english1', '3': 'english3', '4': 'english4', '5': 'english5', '6': 'english6', '7': 'english7'}
{'F': 'F', 'female': 'F', 'Female': 'F', 'M': 'M', 'male': 'M', 'Male': 'M'}
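If the if/elif chain grows with more categories, one possible variation (not part of the answer above) is to keep the named dictionaries but select them through a small lookup table. The `targets` mapping, the `other` row, and the `io.StringIO` sample are assumptions made for this sketch.

```python
import csv
import io

# In-memory stand-in for filename.csv; the "other" row is invented
# to show that unlisted categories are skipped.
sample = io.StringIO(
    "language,1,english1\n"
    "gender,F,F\n"
    "other,x,y\n"
    "gender,male,M\n"
)

dict_language = {}
dict_gender = {}

# Hypothetical lookup table: first-column value -> dictionary to fill.
targets = {"language": dict_language, "gender": dict_gender}

for d, key, val in csv.reader(sample):
    target = targets.get(d)
    if target is not None:  # rows with an unlisted first column are ignored
        target[key] = val

print(dict_language)
print(dict_gender)
```

As with the if/elif version, the category names still have to be known up front.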
Upvotes: 2