Ogen

Reputation: 6709

Python: Use sets in conjunction with dictionaries

I have this function that generates a directed graph as a dictionary, where each key is a node and its value is the list of nodes that key points to. For example, in {'stack': ['over', 'flow']}, stack points to over and flow.

import copy

def generateGraph(fileName):
    heroDict = {}
    graph = {}
    with open(fileName) as inFile:
        for line in inFile:#go through each line
            name_comic = line.rstrip().replace('"', '').split('\t') #split into list with name and comic book as strings
            if name_comic[1] in heroDict: #if the comic book is already in the dictionary
                heroDict[name_comic[1]] += [name_comic[0]] #add the hero into the comic's list of heroes
            else:
                heroDict.update({name_comic[1]: [name_comic[0]]}) # update dictionary with name and comic book
    for i in heroDict.values():
        for j in i:
            if graph.has_key(j):
                tempDict = copy.deepcopy(i)
                tempDict.remove(j)
                heroList = tempDict
                graph[j] += heroList
            else:
                tempDict = copy.deepcopy(i)
                tempDict.remove(j)
                heroList = tempDict
                graph[j] = heroList
        print graph #<========== the graph has duplicates, ie, values that are the same as their keys are present
    return graph

My question is: how can I use sets together with the dictionary so that a key is never added to its own list of values?

Upvotes: 1

Views: 383

Answers (1)

Martijn Pieters

Reputation: 1121484

Here's how I'd recode your graph builder; using the csv module and the collections.defaultdict class makes the code much more readable:

import csv
from collections import defaultdict

def generateGraph(fileName):
    heroDict = defaultdict(list)

    with open(fileName, 'rb') as inFile:
        reader = csv.reader(inFile, delimiter='\t')
        for row in reader:
            name, comic = row[:2]
            heroDict[comic].append(name)

    graph = defaultdict(list)
    for names in heroDict.itervalues():
        for name in names:
            graph[name].extend(n for n in names if n != name)
    print graph
    return graph

There is no need to use sets to keep a name out of its own list: the n != name filter already does that. Note that I used more meaningful variable names; try to avoid i and j unless they are integer indices.
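If repeated neighbours also matter (with lists, two heroes who share several comics are appended to each other once per shared comic), a set-based variant is one option. The following is only a minimal sketch, assuming Python 3 and input that has already been parsed into (name, comic) pairs; build_graph, rows and heroes_by_comic are hypothetical names used for illustration, not part of the original code.

```python
from collections import defaultdict


def build_graph(rows):
    """Build a hero -> set-of-co-stars graph from (name, comic) pairs."""
    # Hypothetical helper: groups hero names by the comic they appear in.
    heroes_by_comic = defaultdict(set)
    for name, comic in rows:
        heroes_by_comic[comic].add(name)

    graph = defaultdict(set)
    for names in heroes_by_comic.values():
        for name in names:
            # Set difference drops the hero itself; the in-place union
            # ignores neighbours that were already recorded.
            graph[name] |= names - {name}
    return graph


# Made-up sample data: A and B share two comics, C appears only in c2.
rows = [
    ("A", "c1"),
    ("B", "c1"),
    ("A", "c2"),
    ("B", "c2"),
    ("C", "c2"),
]
graph = build_graph(rows)
```

The trade-off is that sets discard ordering and multiplicity, so this only fits if you do not need to know how many comics two heroes share.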

Upvotes: 4
