Reputation: 81
EDITED VERSION: I have a CSV file with two columns and 16,000 rows. I want to compare each cell (Address) with the other addresses to find the unique ones and put them into a separate dictionary (again with ID as the key and Address as the value). I guess my file holds delimiter-separated values, but I am not sure about that and don't know how to check. Here is an example of what it looks like:
ID Address
111 abcd
112 def
122 ghi
113 gkl
132 mno
123 abc
131 lnoghi
134 mko
135 mnoe
136 dfo
I think I need to load it into a dictionary, then take one key and its value and compare it with the rest; if it is unique, put it into a new list/dict. Will it cause a problem if identical or similar elements are repeated more than once? Can you please help me with that? If there is a better way than using a dictionary, I would be happy to hear it.
Thanks.
Upvotes: 0
Views: 137
Reputation: 26315
Since there can be several identical names while the IDs are unique, you can make a dictionary with the names as keys and lists of IDs as values. Here is an example function I wrote a while ago:
from collections import defaultdict

def read_file(filename):
    # create the dictionary of lists
    data = defaultdict(list)
    # read the file
    with open(filename) as file:
        # skip headers
        next(file)
        # go over each line
        for line in file:
            # split lines on whitespace
            items = line.split()
            ids, name = int(items[0]), items[1]
            # append ids with name
            data[name].append(ids)
    return data
Which creates a dictionary of your data:
>>> print(dict(read_file("yourdata.txt")))
{'mno': [132, 131], 'ghi': [122], 'def': [112], 'gkl': [113], 'abc': [111, 123]}
Then you can simply look up a key (name) to get every ID that shares it.
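To get from that dictionary to the "unique ones" the question asks about, one option is to keep only the names whose ID list has exactly one entry. This is a minimal sketch rather than part of the original function: `split_unique` is a hypothetical helper name, and the rows are illustrative values matching the output printed above, not the real file.

```python
from collections import defaultdict


def split_unique(data):
    """Separate names that occur once from names that occur several times.

    `data` maps each name to the list of IDs that share it.
    Returns (unique, repeated): `unique` maps ID -> name,
    `repeated` maps name -> list of IDs.
    """
    unique = {}
    repeated = {}
    for name, ids in data.items():
        if len(ids) == 1:
            unique[ids[0]] = name
        else:
            repeated[name] = list(ids)
    return unique, repeated


# Illustrative rows (assumed), matching the output printed above.
rows = [(111, "abc"), (112, "def"), (122, "ghi"), (113, "gkl"),
        (132, "mno"), (123, "abc"), (131, "mno")]

data = defaultdict(list)
for id_, name in rows:
    data[name].append(id_)

unique, repeated = split_unique(data)
print(unique)    # {112: 'def', 122: 'ghi', 113: 'gkl'}
print(repeated)  # {'abc': [111, 123], 'mno': [132, 131]}
```

A name that is repeated more than twice causes no problem, its list just gets longer. Note that this only treats exact matches as duplicates; "similar" addresses such as abc and abcd would still count as different names.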
Upvotes: 1
Reputation: 924
As @RoadRunner suggested, you can do the following, assuming you have already read your CSV into two lists:
ID = [111, 112, 122, 113, 132, 123, 131]
Names = ['abc', 'def', 'ghi', 'gkl', 'mno', 'abc', 'mno']
# start every name with an empty list
dictionary = {}
for name in Names:
    dictionary[name] = []
# append each ID to the list kept for its name
for i in range(len(Names)):
    dictionary[Names[i]].append(ID[i])
print(dictionary)
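The snippet above assumes the two lists already exist. As a rough sketch of that reading step, the standard `csv` module can guess the delimiter with `csv.Sniffer` and then fill both lists. `read_columns` and `group_ids_by_name` are hypothetical names, the candidate delimiters are an assumption, and the in-memory text stands in for the real file using the sample rows from the question.

```python
import csv
import io


def read_columns(stream, candidates=" \t,;"):
    """Guess the delimiter, then read a two-column file into two lists."""
    sample = stream.read(2048)
    stream.seek(0)
    # Sniffer inspects the sample and picks one of the candidate delimiters;
    # it raises csv.Error if it cannot decide.
    dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    reader = csv.reader(stream, dialect)
    next(reader)  # skip the header row
    ids, names = [], []
    for row in reader:
        if len(row) < 2:  # ignore blank or incomplete lines
            continue
        ids.append(int(row[0]))
        names.append(row[1])
    return dialect.delimiter, ids, names


def group_ids_by_name(ids, names):
    """Map each name to the list of IDs that carry it."""
    groups = {}
    for id_, name in zip(ids, names):
        groups.setdefault(name, []).append(id_)
    return groups


# Stand-in for open("yourdata.txt"): the sample rows from the question.
text = ("ID Address\n111 abcd\n112 def\n122 ghi\n113 gkl\n132 mno\n"
        "123 abc\n131 lnoghi\n134 mko\n135 mnoe\n136 dfo\n")
delimiter, ids, names = read_columns(io.StringIO(text))
groups = group_ids_by_name(ids, names)
print(repr(delimiter))
print(groups["abc"])
```

With a real file you would pass `open(filename, newline="")` instead of the `io.StringIO` object. Pairing the lists with `zip` also avoids indexing past the end if one list turns out shorter than the other.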
Upvotes: 1