user374372

Reputation: 190

Why are dictionary values being overridden at the end of this loop?

I have a dictionary of public transportation stops called stops. I want to duplicate the stops that are transfers (served by more than one line) so that stops contains one copy of the stop for each additional line. I initially store those duplicates in a dictionary called duplicates. However, after I assign the appropriate line name to each duplicate stop, every one of them is overridden by the last line in the original stop's list of lines. So I end up with a bunch of duplicate stops that all have the same line, instead of one stop for each line. What is overriding these values? The file l_stops.csv is on Dropbox and bpaste.

import csv
import random

def stop_coords():
    with open('l_stops.csv', 'rb') as csvfile:
        stop_reader = csv.reader(csvfile, delimiter=',', quotechar='"')
        stops = {}
        for row in stop_reader:
            map_id = row[5]
            lines = set()
            if row[7] == 'true':
                lines.add('Red')
            if row[8] == 'true':
                lines.add('Blue')
            if row[9] == 'true':
                lines.add('Green')
            if row[10] == 'true':
                lines.add('Brown')
            if row[11] == 'true':
                lines.add('Purple')
            if row[13] == 'true':
                lines.add('Yellow')
            if row[14] == 'true':
                lines.add('Pink')
            if row[15] == 'true':
                lines.add('Orange')
            if map_id not in stops:
                stop_name = row[2].partition('(')[0].rstrip(' ')
                lat = float(row[16].lstrip('"(').rpartition(',')[0])
                lng = float(row[16].lstrip('"(').rpartition(',')[2].strip(' )"'))
                stop = {}
                stop['name'] = stop_name
                stop['lat'] = lat
                stop['lng'] = lng
                stop['x'] = lng
                stop['y'] = lat
                stop['lines'] = lines
                stops[map_id] = stop
                stop['duplicateStops'] = []
            elif stops[map_id]['lines'] != lines:
                stops[map_id]['lines'] = stops[map_id]['lines'].union(lines)
        for item in stops:
            stops[item]['lines'] = list(stops[item]['lines'])

        # Add duplicate stops for stops that are transfers (shared by multiple lines)
        duplicates = {} # the dictionary that will hold the duplicates and be added to the stops dictionary after all duplicate stops have been processed
        for item in stops:
            num_lines = len(stops[item]['lines'])
            if num_lines > 1: # if a stop has more than one line
                original_lines = stops[item]['lines']
                stops[item]['lines'] = original_lines[0]
                equivalent_map_ids = [item] # Make a list of different map_ids that represent the same stop (but on different lines). The first map_id in the list will be the "original" one.
                for i in range(num_lines - 1): # for each line after the first one
                    # Create a new map_id and make sure it doesn't conflict with an existing map_id
                    while True:
                        new_map_id = str(random.randint(10000, 99999))
                        if new_map_id not in stops and new_map_id not in duplicates:
                            break
                    duplicates[new_map_id] = stops[item] # duplicate the stop
                    equivalent_map_ids.append(new_map_id) # add the new map_id to the list of equivalent map_ids
                # Set the duplicateStops value of everyone in equivalent_map_ids's to the other stops' map_ids
                # The first map_id in equivalent_map_ids is the original one that's in the stops dictionary, so set its duplicateStops value to the rest of the list
                stops[item]['duplicateStops'] = equivalent_map_ids[1:]

                # For the rest of the map_ids in equivalent_map_ids
                j = 1
                for duplicate_stop in stops[item]['duplicateStops']:
                    duplicates[duplicate_stop]['lines'] = original_lines[j]
                    duplicates[duplicate_stop]['duplicateStops'] = equivalent_map_ids[:j] + equivalent_map_ids[(j + 1):]  # this line also changes stops[item]['duplicateStops'], not sure how
                    j+= 1
                # somehow by this point all duplicates have the same line (the last line in the original 'lines' list)
                for stop in stops[item]['duplicateStops']:
                    print duplicates[stop]['name']
                    print duplicates[stop]['lines']

        for item in duplicates:
            print item
            print duplicates[item]['name']
            print duplicates[item]['lines']
        stops.update(duplicates)
        stops['none'] = {'name' : 'none', 'lat' : 0, 'lng' : 0, 'x' : 0, 'y' : 0, 'lines' : ['none']}

While debugging, I discovered that reassigning duplicates[duplicate_stop]['duplicateStops'] also reassigns the stops[item]['duplicateStops']. How is that possible? duplicates and stops are two separate dictionaries.

Upvotes: 0

Views: 93

Answers (1)

user2864740

Reputation: 61965

Then duplicates[duplicate_stop] and stops[item] both name the same object - and mutating the object, well, changes the object. Objects are not automatically copied/cloned/duplicated on an assignment or when used as function arguments.
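To make the aliasing concrete, here is a minimal sketch. The map_ids '1001' and '2002' and the stop contents are made up for illustration and are not taken from l_stops.csv; only the shape of the assignment mirrors the question's code. It runs unchanged on Python 2 and Python 3.

```python
# Hypothetical stand-ins for the question's two dictionaries.
stops = {'1001': {'name': 'Example Stop', 'lines': 'Red'}}
duplicates = {}

# Same shape as the assignment in the question: this stores a second
# reference to the existing inner dict, it does not create a new dict.
duplicates['2002'] = stops['1001']

# Both keys now lead to one and the same object.
same_object = duplicates['2002'] is stops['1001']

# Changing the inner dict through one dictionary...
duplicates['2002']['lines'] = 'Blue'

# ...is visible through the other, because there is only one inner dict.
line_seen_via_stops = stops['1001']['lines']
```

Here same_object is True and line_seen_via_stops is 'Blue', even though stops and duplicates are themselves two separate dictionaries. Only the outer containers are distinct; the value they hold is shared.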

The problematic line is most likely

duplicates[new_map_id] = stops[item] # duplicate the stop

... and its comment is wrong, because no duplication occurs: the assignment only stores another reference to the same dictionary.

The question "Understanding dict.copy() - shallow or deep?" may be useful; at the very least it shows how to make a real copy.
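As a rough sketch of the difference between the two kinds of copy, assuming a stop shaped like the ones in the question (the values and the ids '2002' and '3003' are hypothetical):

```python
import copy

# Hypothetical stop with the same keys the question uses.
original = {'name': 'Example Stop', 'lines': 'Red', 'duplicateStops': []}

shallow = original.copy()        # new outer dict, nested objects still shared
deep = copy.deepcopy(original)   # new outer dict and new nested objects

# Rebinding a key on a copy leaves the original untouched.
shallow['lines'] = 'Blue'

# Mutating a nested list through the shallow copy also changes the
# original, because both dicts refer to the same list object.
shallow['duplicateStops'].append('2002')

# The deep copy owns its own list, so this change stays local to it.
deep['duplicateStops'].append('3003')
```

Afterwards original['lines'] is still 'Red', original['duplicateStops'] is ['2002'], and deep['duplicateStops'] is ['3003']. In the question's loop, replacing the assignment with something like duplicates[new_map_id] = copy.deepcopy(stops[item]) should give each duplicate its own dictionary. A shallow copy would probably be enough there, since the loop only rebinds the 'lines' and 'duplicateStops' keys, but deepcopy is the safer default when the stop holds nested lists.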

Upvotes: 2
