pattymills
pattymills

Reputation: 137

Trying to delete specific items from a list within a list within a dictionary. (Python2)

My data structures are as follows:

def cross(A, B):
    return [a+b for a in A for b in B]

digits = '123456789'
rows = 'ABCDEFGHI'
cols = digits
squares = cross(rows, cols)
unitlist = ([cross(rows, c) for c in cols] +
            [cross(r, cols) for r in rows] +
            [cross(rs, cs) for rs in ('ABC', 'DEF', 'GHI') for cs in ('123', '456', '789')])
units = dict((s, [u for u in unitlist if s in u]) for s in squares)
peers = dict((s, set(sum(units[s],[]))-set([s])) for s in squares)

This causes units to output

{'A1': [['A1','A2','A3','A4','A5','A6','A7','A8','A9'], 
        ['A1','B1','C1','D1','E1','F1','G1','H1','I1'], 
        ['A1','A2','A3','B1','B2','B3','C1','C2','C3']], 
 'A2': [['A1','A2','A3','A4','A5','A6','A7','A8','A9'],       
        ['A2','B2','C2','D2','E2','F2','G2','H2','I2'], 
        ['A1','A2','A3','B1','B2','B3','C1','C2','C3']],
 'A3': [[etc.]]} 

I want to be able to create a copy of units, but to not have the KEY be a string inside one of its own lists. So I want 'A1' removed from the first 3 lists. 'A2' removed from the next 3 lists, etc.

I can import copy and do units2 = copy.deepcopy(units) then units2['A1'][0].remove('A1') to do it one list at a time. So that led me to try making a loop to do it all at once.

for s in squares: #assign s to 'A1' then 'A2', etc.
    for x in range(3): #
        units2[s][x].remove(s)

I would think that this would do it flawlessly by running

units2['A1'][0].remove('A1')
units2['A1'][1].remove('A1')
units2['A1'][2].remove('A1')
units2['A2'][0].remove('A2')
# etc.

Unfortunately, after running this loop, I end up with:

units2 = {'B8': [[], [], []], 
          'H1': [[], [], []],
          'C7': [[], [], []],
          'B3': [[], [], []],
          # etc.
          }

So somehow, this loop is deleting all of the data inside of the lists instead of just the current [s] in the current [x] list.

I have tried structuring the list in different ways and even made 81 versions of

for x in range(3):
    units2['A1'][x].remove('A1')
for x in range(3):
    units2['A2'][x].remove('A2')
for x in range(3):
    units2['A3'][x].remove('A3')

But I still end up with empty lists as all of my dictionary values.

*What I'm trying to do is build a sudoku solver to then use to generate sudokus for a game in PyGame. I plan on using the lists in units2 to check and see if only 1 cell in any given list is still eligible for a specific number. If that is the case, I know that that cell must have that number as its value because it is the only one in the column, row, or 9x9 block that can legally hold it based on the rules.

Upvotes: 0

Views: 57

Answers (2)

Wayne Werner
Wayne Werner

Reputation: 51807

If you look at what is happening here and step back for a moment, the answer becomes obvious.

You are no doubt aware what causes this:

In [1]: a = [1,2,3]
In [2]: b = a   
In [3]: a.remove(2)
In [4]: print(a)
[1, 3]

In [5]: print(b)
[1, 3]

That's why you have copy.deepcopy. But when you look at what's happening, if you use Occam's Razor, the obvious answer is that you have multiple references to the same list, and when you delete from another list that ends out emptying all of the lists in your structure. Here's an easy way to check that:

import collections
import pprint

counter = collections.Counter()
for s in squares:
    for x in range(3):
        counter[id(units2[s][x])] += 1

pprint.pprint(counter)

Which outputs (unsurprisingly):

Counter({140502086261512: 9,
         140502086288328: 9,
         140502086261256: 9,
         140502086287688: 9,
         140502086288008: 9,
         # and so on...
         })

Obviously your ids will be different, but the counts will all be 9.

You also could've used pdb or pudb to inspect the variables as you looped through.

If you replace units2 with units you'll get the same counts. So the real question here is why doesn't copy.deepcopy behave the way you thought it did? And why do you have multiple copies of the same list in your list of lists?

Let's try a few experiments:

list_of_lists = [[1,1,1],
                 [2,2,2],
                 [3,3,3],
                 ]
deepcopy_of_lists = copy.deepcopy(list_of_lists)
print(id(list_of_lists))
print(id(deepcopy_of_lists))

I would expect two different id's here, and indeed they are. But what about their contents?

print(id(list_of_lists[0]))
print(id(deepcopy_of_lists[0]))

Yep. Those are different. What if we put these in a dictionary?

dict_of_list_of_lists = {'list_of_lists': list_of_lists}
deepcopy_of_dict = copy.deepcopy(dict_of_list_of_lists)
print(id(dict_of_list_of_lists['list_of_lists'][0]))
print(id(deepcopy_of_dict['list_of_lists'][0]))

We still get different ids. Everything seems to be behaving as expected so obviously we're missing something. Oh, yes - we're comparing between our copied lists and our original lists.

If you replace units2 with units you'll get the same counts.

There is a hint. Also, if you read the deepcopy documentation:

Two problems often exist with deep copy operations that don’t exist with shallow copy operations:

  • Recursive objects (compound objects that, directly or indirectly, contain a reference to themselves) may cause a recursive loop.
  • Because deep copy copies everything it may copy too much, e.g., administrative data structures that should be shared even between copies.

The deepcopy() function avoids these problems by:

  • keeping a “memo” dictionary of objects already copied during the current copying pass; and
  • letting user-defined classes override the copying operation or the set of components copied.

deepcopy pays attention to objects that it has already seen.

mylist = [1,2,3]
list_of_mylist = [mylist, mylist, mylist]
deepcopy_of_mylist = copy.deepcopy(list_of_mylist)

print(id(deepcopy_of_mylist[0]))
print(id(deepcopy_of_mylist[1]))

Voilá! We have our culprit!

In your original dictionary units you have multiple copies to the same lists in different keys. An that's because of this line:

units = dict((s, [u for u in unitlist if s in u]) for s in squares)

This section in particular:

[u for u in unitlist if s in u]

You are repeating references to lists from unitlist throughout your dictionary. Luckily this is a trivial fix:

[u[:] for u in unitlist if s in u]

Will make a copy of that list each time, and the rest of your code will work just fine.

import pprint
import copy
import pprint

def cross(A, B):
        return [a+b for a in A for b in B]

digits = '123456789'
rows = 'ABCDEFGHI'
cols = digits
squares = cross(rows, cols)
unitlist = ([cross(rows, c) for c in cols] +
                        [cross(r, cols) for r in rows] +
                        [cross(rs, cs) for rs in ('ABC', 'DEF', 'GHI') for cs in ('123', '456', '789')])

units = dict((s, [u[:] for u in unitlist if s in u]) for s in squares)
peers = dict((s, set(sum(units[s],[]))-set([s])) for s in squares)

units2 = copy.deepcopy(units)
for s in squares: #assign s to 'A1' then 'A2', etc.
    for x in range(3): #
        units2[s][x].remove(s)

pprint.pprint(units2)

Outputs:

{'A1': [['B1', 'C1', 'D1', 'E1', 'F1', 'G1', 'H1', 'I1'],
        ['A2', 'A3', 'A4', 'A5', 'A6', 'A7', 'A8', 'A9'],
        ['A2', 'A3', 'B1', 'B2', 'B3', 'C1', 'C2', 'C3']],
 'A2': [['B2', 'C2', 'D2', 'E2', 'F2', 'G2', 'H2', 'I2'],
        ['A1', 'A3', 'A4', 'A5', 'A6', 'A7', 'A8', 'A9'],
        ['A1', 'A3', 'B1', 'B2', 'B3', 'C1', 'C2', 'C3']],
 'A3': [['B3', 'C3', 'D3', 'E3', 'F3', 'G3', 'H3', 'I3'],
        ['A1', 'A2', 'A4', 'A5', 'A6', 'A7', 'A8', 'A9'],
        ['A1', 'A2', 'B1', 'B2', 'B3', 'C1', 'C2', 'C3']],
 # and so on and so forth
 }

Upvotes: 1

James Buck
James Buck

Reputation: 1640

Using nested list comprehension change units to be:

units = dict((s, [[e for e in u if s != e] 
                        for u in unitlist if s in u]) for s in squares)

Upvotes: 1

Related Questions