Python TypeError in traversing a list

Question

I'm teaching myself Python 3.2 and I'm trying to make a program to match a list of names. pList is a multidimensional list with a string at column 0, an integer at column 1, and a boolean at column 2. However, whenever I try and call this function (which only runs if the number of rows in the list is even), I get a TypeError.

Traceback (most recent call last):
 File "C:\Users\METC\Dropbox\assassins.py", line 150, in 
main()
 File "C:\Users\METC\Dropbox\assassins.py", line 11, in main
update(ops, pList)
 File "C:\Users\METC\Dropbox\assassins.py", line 125, in update
b = match(pList)
 File "C:\Users\METC\Dropbox\assassins.py", line 47, in match
q, p = 0
TypeError: 'int' object is not iterable

Any help would be appreciated, but keep in mind I'm a beginner with the language, so be gentle. :) I don't mind if you get too technical though; I have experience with computer science.

def match(pList):
    b = []
    z = len(pList)-1
    for x in range(z):
        b.append([pList[x][0],0])
    for x in range(z):
        isValid = False
        q, p = 0
        while isValid == False:
            q = random.randint(0, z)
            print('q is ' + str(q))
            if q > z:
                isValid = False
            elif q < 0:
                isValid = False
            elif pList[q][1] == True:
                isValid = False
            else:
                isValid = True
        isMatch = False
        while isMatch == False:
            if pList[q][1] == False:
                isValid = False
                while isValid == False:
                    p = random.randint(0,z)
                    print('p is ' + str(p))
                    if p > z:
                        isValid = False
                    elif p < 0:
                        isValid = False
                    elif pList[p][2] == True:
                        isValid = False
                    else:
                        if q == p:
                            isValid = False
                        else:
                            isValid = True
                print('match valid')
                b[q][1] = pList[p][0]
                isMatch = True
    print('')
    return b

Karl Knechtel · Accepted Answer

You've made the logic far, far too complicated, to the point where I'm going to have to make several passes to get it down to size and show you what you're doing wrong.

First, we'll fix the actual reported error, as the others noted. At the same time, we'll apply a simple principle: don't compare to boolean literals. You don't say "if it is true that it is raining, I will need an umbrella". You say "if it is raining, I will need an umbrella". So cut out the extra stuff. if isValid is more clear than if isValid == True, as it highlights exactly what isValid is supposed to mean. I'm also going to take out the debug traces (print statements that are clearly only there to check if the code's doing the right thing; simplify the code first, and then there is less to check).

def match(pList):
    b = []
    z = len(pList)-1
    for x in range(z):
        b.append([pList[x][0],0])
    for x in range(z):
        isValid = False
        q = p = 0
        while not isValid:
            q = random.randint(0, z)
            if q > z:
                isValid = False
            elif q < 0:
                isValid = False
            elif pList[q][1]:
                isValid = False
            else:
                isValid = True
        isMatch = False
        while not isMatch:
            if not pList[q][1]:
                isValid = False
                while not isValid:
                    p = random.randint(0,z)
                    if p > z:
                        isValid = False
                    elif p < 0:
                        isValid = False
                    elif pList[p][2]:
                        isValid = False
                    else:
                        if q == p:
                            isValid = False
                        else:
                            isValid = True
                b[q][1] = pList[p][0]
                isMatch = True
    return b

Next, we're going to simplify our conditional logic. First off, the result returned from random.randint(0, z) cannot be < 0 nor > z, ever, no matter what. That's part of the very point of the function. So there is no point in writing code to handle those cases and in fact is wrong to do so. Writing code to handle something implies that it could actually happen. That's a distraction for the person reading the code, and a lie, at that. It puts extra space between the things that matter (the call to random.randint and the check against the pList value).

We're also going to simplify if/else pairs that simply set another boolean accordingly. For the same reason that you wouldn't write

if x == 1:
    y == 1
elif x == 2:
    y == 2
# ... etc. ad infinitum for every possible integer value of x

you shouldn't do the same with booleans, either. Finally, we can and should use logical and and or to connect boolean conditions.

def match(pList):
    b = []
    z = len(pList)-1
    for x in range(z):
        b.append([pList[x][0],0])
    for x in range(z):
        isValid = False
        q = p = 0
        while not isValid:
            q = random.randint(0, z)
            isValid = not pList[q][1]
        isMatch = False
        while not isMatch:
            if not pList[q][1]:
                isValid = False
                while not isValid:
                    p = random.randint(0,z)
                    isValid = not pList[p][2] and (q != p)
                b[q][1] = pList[p][0]
                isMatch = True
    return b

My next step will be to fix up the list indexing. Indexing into lists is usually not really what you want, and here in fact it introduces a bug. It's evident that you want to iterate over each "row" of pList; but range(z) gives you numbers from 0 to z-1 inclusive, so it's incorrect to subtract 1 from len(pList) in calculating z. For example, if pList has 5 elements, you will calculate z = 4, and produce range(z) = [0, 1, 2, 3]. You will never access pList[4], and b will only have 4 elements.

You are doing two things with z, fundamentally. One is to make a loop run as many times as there are "rows" in pList, and (in the first loop) do something with each "row". To do this

This is very important: range is not magic, and it has no special connection to for-loops. It's just a function that produces a list of numbers. In Python, a for loop gives you the elements directly. All this indexing nonsense is just that - nonsense, that is best left to less capable languages. If you want to do something with each element of a list, then do something with each element of the list, by writing code that loops over each element of the list. Directly. Not over some separate list of indices that you then use to index back into the original. That's going out of your way to make things complicated.

The second thing you do with z is to generate a random number that's a possible index, so that you can index into pList to get a random row. In other words, you just want to choose a random row. So just choose a random row. The random module provides this functionality directly: the function is called random.choice, and it does exactly what it sounds like.

There's one small hitch here: the original code compares p == q, i.e. compares the two randomly chosen list indices for equality. If we're no longer indexing, then we don't have indices to compare. To fix this, we need to understand what the original purpose was: to check whether the new chosen row is actually the old chosen row again. Again, we simplify by checking that directly: we pick the new row instead of a new index, and then see if it is the old one.

We also have the problem that we need to choose a corresponding row from b that corresponds to whichever row in pList we would previously have identified as q. To deal with that, we can select the b row at the same time as the pList row. This is a little tricky: my approach will be to make a list of pairs of rows - in each pair, there will be a row from b and then a row from pList. This doesn't require anything complicated - there is in fact a built-in function that will stitch b and pList together exactly as we want: it's called zip. Anyway, having chosen a row-pair from that list of row-pairs, we just have to unpack the two rows into two variables - using the q, p = ... syntax that you were mistakenly using in the first place, as it turns out. That's what it's for.

With these changes, we can actually get rid of p, q and z entirely. Which is nice, because it's not at all clear what those names were supposed to mean anyway.

def match(pList):
    b = []
    for row in pList:
        b.append([row[0], 0])
    for row in pList:
        isValid = False
        while not isValid:
            first_row, b_row = random.choice(zip(pList, b))
            isValid = not first_row[1]
        isMatch = False
        while not isMatch:
            if not first_row[1]:
                isValid = False
                while not isValid:
                    second_row = random.choice(pList)
                    isValid = not second_row[2] and (first_row is not second_row)
                b_row[1] = second_row[0]
                isMatch = True
    return b

Time for a little more logical cleanup. In the first while loop, we will keep looping until isValid becomes true. That is to say, until not first_row[1] becomes true. In the second while loop, first_row never gets changed, so, since not first_row[1] was true when the loop started, it will remain true the entire time. Therefore, that if-check is completely unnecessary.

Once that's gone, we find that the second while loop is actually also completely useless: it will loop while not isMatch, i.e. until isMatch. What is isMatch? Well, before we start the loop, it's False, and at the end of the loop, it's True. Always. So we know that this code will run exactly once. We enter the loop, go to the end, set isMatch to true, and then exit, since isMatch, having just been set to true, is true. Code that runs exactly once doesn't need a loop; it's just code.

Another thing I'll do here is to clean up the while isValid loops a little bit by converting them to just break out when we're done. break is not evil (and neither is continue). They actually simplify our thinking about the booleans, because we aren't checking against not isValid any more (emphasis on the not); we're just directly comparing to what we would have assigned to isValid. And this means we get rid of the isValid variable, too, which again was a name that doesn't actually tell us very much.

def match(pList):
    b = []
    for row in pList:
        b.append([row[0], 0])
    for row in pList:
        while True:
            first_row, b_row = random.choice(zip(pList, b))
            if not first_row[1]:
                break
        while True:
            second_row = random.choice(pList)
            if not second_row[2] and (first_row is not second_row):
                break
        b_row[1] = second_row[0]
    return b

One last thing: we can construct b much more cleanly. Building up a list by appending elements is a sucker's game. Don't tell Python how to build lists. It knows how. Instead, ask for a list that meets your specifications, with a list comprehension. This is more simply shown than explained (if you need an explanation, please consult Google), so I'll just go ahead and give you one more version:

def match(pList):
    b = [[row[0], 0] for row in pList]
    for row in pList:
        while True:
            first_row, b_row = random.choice(zip(pList, b))
            if not first_row[1]:
                break
        while True:
            second_row = random.choice(pList)
            if not second_row[2] and (first_row is not second_row):
                break
        b_row[1] = second_row[0]
    return b

From here on, it's hard to correct or improve anything without understanding what it is that you're actually doing. After all this work, I still have no idea!

The way you have it set up, you choose a random row, as many times as there are rows - but you could still choose duplicates. Is that what you really want? Or did you want to choose each row once in a random order? What's the significance of doing it in a random order, anyway?

After choosing the first row, you choose a random second row to match. Did you really want one random row per first row? Or did you want to try all possible pairs of rows?

And just what is all of this data, anyway? What does the output b data represent? What exactly is in pList to begin with, and why is it called pList? What are you "matching" with this match function? I honestly can't imagine.

Python TypeError in traversing a list

Answers (2)

Related Questions