Reputation: 6709
I have this method that generates a directed graph in the form of a dictionary, where the values of a key are the nodes that the key points to, i.e., in {'stack': ['over', 'flow']}, stack points to over and flow:
import copy

def generateGraph(fileName):
    heroDict = {}
    graph = {}
    with open(fileName) as inFile:
        for line in inFile:  # go through each line
            name_comic = line.rstrip().replace('"', '').split('\t')  # split into list with name and comic book as strings
            if name_comic[1] in heroDict:  # if the comic book is already in the dictionary
                heroDict[name_comic[1]] += [name_comic[0]]  # add the hero into the comic's list of heroes
            else:
                heroDict.update({name_comic[1]: [name_comic[0]]})  # update dictionary with name and comic book
    for i in heroDict.values():
        for j in i:
            if graph.has_key(j):
                tempDict = copy.deepcopy(i)
                tempDict.remove(j)
                heroList = tempDict
                graph[j] += heroList
            else:
                tempDict = copy.deepcopy(i)
                tempDict.remove(j)
                heroList = tempDict
                graph[j] = heroList
    print graph  # <========== the graph has duplicates, i.e., values that are the same as their keys are present
    return graph
My question is: how can I use sets with dictionaries to prevent a value that is the same as the key in question from being added to that key?
Upvotes: 1
Views: 383
Reputation: 1121484
Here's how I'd recode your graph builder; using the csv module and the collections.defaultdict class makes the code vastly more readable:
import csv
from collections import defaultdict

def generateGraph(fileName):
    heroDict = defaultdict(list)
    with open(fileName, 'rb') as inFile:
        reader = csv.reader(inFile, delimiter='\t')
        for row in reader:
            name, comic = row[:2]
            heroDict[comic].append(name)
    graph = defaultdict(list)
    for names in heroDict.itervalues():
        for name in names:
            graph[name].extend(n for n in names if n != name)
    print graph
    return graph
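The code above uses Python 2 idioms (the 'rb' file mode, itervalues() and the print statement). For readers on Python 3, a rough port might look like the sketch below. The name generate_graph_py3 and the in-memory sample data are assumptions made for illustration only; they are not part of the original code.

```python
import csv
import io
from collections import defaultdict


def generate_graph_py3(lines):
    """Build {hero: [co-appearing heroes]} from tab-separated hero/comic lines."""
    heroes_by_comic = defaultdict(list)
    # csv.reader strips the surrounding double quotes for us.
    for row in csv.reader(lines, delimiter='\t'):
        name, comic = row[:2]
        heroes_by_comic[comic].append(name)

    graph = defaultdict(list)
    for names in heroes_by_comic.values():
        for name in names:
            # Skip the hero itself so a key never appears among its own values.
            graph[name].extend(n for n in names if n != name)
    return graph


# Hypothetical sample data standing in for the real file.
sample = io.StringIO(
    '"stack"\t"c1"\n'
    '"over"\t"c1"\n'
    '"flow"\t"c1"\n'
    '"stack"\t"c2"\n'
    '"over"\t"c2"\n'
)
graph = generate_graph_py3(sample)
```

With a real file on Python 3, the csv documentation recommends opening it in text mode with newline='', for example open(fileName, newline=''), and passing the file object as lines. In this sample, stack and over share two comics, so each appears twice in the other's list, but no hero appears in its own list.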
There is no need to use sets here. Note that I used more meaningful variable names; try to avoid i and j unless they are integer indices.
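Sets are not required to keep a key out of its own values, but if you also want each neighbour listed only once (two heroes can share several comics), a set-valued variant is one option. The following is a minimal Python 3 sketch under that assumption; build_graph_with_sets and the sample dictionary are hypothetical names and data, and the result loses both ordering and the number of shared comics.

```python
from collections import defaultdict


def build_graph_with_sets(heroes_by_comic):
    """Map each hero to the set of distinct heroes sharing at least one comic."""
    graph = defaultdict(set)
    for names in heroes_by_comic.values():
        unique = set(names)
        for name in unique:
            # Set difference drops the hero itself; the in-place union
            # ignores neighbours that are already recorded.
            graph[name] |= unique - {name}
    return graph


# Hypothetical comic -> heroes mapping, as built in the first loop above.
heroes_by_comic = {
    'c1': ['stack', 'over', 'flow'],
    'c2': ['stack', 'over'],
}
graph = build_graph_with_sets(heroes_by_comic)
```

Here graph['stack'] is the set containing over and flow, even though stack and over appear together in two comics.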
Upvotes: 4