Reputation: 1644
I have a data file in the following format:
9, 12, 16, ABC, a12d
8, 09, 24, ADP, v154a
6, 07, 16, ADP, l28a
2, 14, 15, CDE, d123p
I need to build a dictionary of sets in the following format:
ABC : ([a12d])
ADP : ([v154a, l28a])
CDE : ([d123p])
I can build a set from any one of the columns, e.g.:
with open('data.csv','r') as r:
    name = set([line.strip().split(',')[3] for line in r])
I figure there must be a way to turn every element of the set into a dictionary key and add its adjacent value to a set? An added complication is that some keys have multiple values (for example, lines 2 and 3 above), but these are spread across separate lines.
Thanks in advance for any help
Upvotes: 1
Views: 1215
Reputation: 1
The code below reads column values from a file and converts them into a dictionary in Python.
cat dictionary.txt (this file holds name, age and birth year)
Luffy 20 2000
Nami 18 2002
Chopper 10
##################### code is here #######
#!/usr/bin/python3.7.4
d = {}
with open("dictionary.txt") as f:
    for line in f:
        line = line.split()
        d.setdefault(line[0], []).append(line[1])
        if len(line) == 3:
            d.setdefault(line[0], []).append(line[2])
        else:
            d.setdefault(line[0], []).append('NULL')
print(d)
Output: {'Luffy': ['20', '2000'], 'Nami': ['18', '2002'], 'Chopper': ['10', 'NULL']}
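The same setdefault idea can be adapted to the comma-separated data from the question, collecting values into sets instead of lists. This is only a minimal sketch: the rows are copied inline from the question rather than read from a file, and the variable names (rows, fields, d) are chosen for illustration.

```python
# Rows copied from the question; in practice they would come from data.csv.
rows = [
    "9, 12, 16, ABC, a12d",
    "8, 09, 24, ADP, v154a",
    "6, 07, 16, ADP, l28a",
    "2, 14, 15, CDE, d123p",
]

d = {}
for row in rows:
    # Split on commas and drop the space that follows each comma.
    fields = [field.strip() for field in row.split(",")]
    # Column 3 is the key, column 4 the value; setdefault creates the set once per key.
    d.setdefault(fields[3], set()).add(fields[4])

# Sort for a stable display, since sets are unordered.
for key in sorted(d):
    print(key, sorted(d[key]))
# ABC ['a12d']
# ADP ['l28a', 'v154a']
# CDE ['d123p']
```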
Upvotes: 0
Reputation: 21883
If you don't mind using pandas:
import pandas as pd
df = pd.read_csv("data.csv", header=None, usecols=[3,4], index_col=0, skipinitialspace=1, names=["key", "value"])
Which can be read as: read data.csv, which contains no header, use only columns 3 and 4, and use column 0 (formerly 3) as the index. Skip the initial space in the values, and name the columns you read (3 and 4) key and value. This will give you:
df
     value
key
ABC   a12d
ADP  v154a
ADP   l28a
CDE  d123p
So you can access any value with .loc:
df.loc["ABC"].values
array(['a12d'], dtype=object)
df.loc["ADP"].values
array([['v154a'],
['l28a']], dtype=object)
For the latter, you can flatten the array with ravel():
df.loc["ADP"].values.ravel()
array(['v154a', 'l28a'], dtype=object)
So it's not really a dictionary, but it behaves a bit like one, and you can do much more with this kind of object (a pandas DataFrame). Plus, you can easily read and write CSV files.
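If an actual dictionary of sets is still wanted, one possible route is to group the values by key and build a set per group. The sketch below is an assumption about how that could look, not part of the original answer: it reads the question's rows from an in-memory string instead of data.csv, keeps key as an ordinary column rather than the index, and uses the illustrative name groups for the result.

```python
import io

import pandas as pd

# The question's rows, held in memory so the example is self-contained.
sample = io.StringIO(
    "9, 12, 16, ABC, a12d\n"
    "8, 09, 24, ADP, v154a\n"
    "6, 07, 16, ADP, l28a\n"
    "2, 14, 15, CDE, d123p\n"
)

# Same read_csv options as above, minus index_col so "key" stays a column.
df = pd.read_csv(sample, header=None, usecols=[3, 4],
                 skipinitialspace=True, names=["key", "value"])

# Iterating over the grouped column yields (key, Series of values) pairs.
groups = {key: set(values) for key, values in df.groupby("key")["value"]}

for key in sorted(groups):
    print(key, sorted(groups[key]))
```

Here groups is a plain dict, so groups["ADP"] returns a set holding both values for that key.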
If you don't know pandas, have a look :
Upvotes: 1
Reputation: 19677
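from collections import defaultdict

# Missing keys start out as an empty set automatically.
d = defaultdict(set)
with open('data.csv','r') as r:
    for line in r:
        splitted = line.strip().split(',')
        # Strip the space that follows each comma in the file.
        name = splitted[3].strip()
        value = splitted[4].strip()
        d[name].add(value)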
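For reference, a self-contained variant of this approach that can be run without a data.csv file. It is a sketch under a few assumptions: the question's rows are fed in through io.StringIO, the standard csv module (with skipinitialspace=True) replaces the manual split and strip, and short rows are skipped, which the code above does not do.

```python
import csv
import io
from collections import defaultdict

# The question's rows; io.StringIO stands in for open('data.csv').
sample = io.StringIO(
    "9, 12, 16, ABC, a12d\n"
    "8, 09, 24, ADP, v154a\n"
    "6, 07, 16, ADP, l28a\n"
    "2, 14, 15, CDE, d123p\n"
)

d = defaultdict(set)
# skipinitialspace=True drops the space that follows each comma.
for row in csv.reader(sample, skipinitialspace=True):
    if len(row) < 5:
        continue  # ignore blank or incomplete lines
    d[row[3]].add(row[4])

# Sort for a stable display, since sets are unordered.
for key in sorted(d):
    print(key, sorted(d[key]))
# ABC ['a12d']
# ADP ['l28a', 'v154a']
# CDE ['d123p']
```

Repeated keys such as ADP need no special handling: each line simply adds its value to the set already stored under that key.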
Upvotes: 2