fractalgreg

Reputation: 105

"For" loop structure in python

As I was debugging a small bit of code, I noticed something unexpected:

The inner for loop below walks through each character of the filename and removes the digits. It seems to take a snapshot of the filename as it exists on the first pass and cycle through those characters, so that even if I change the string inside the loop (as I do in the code), Python still iterates over the characters that were in the string to begin with.

Have I just uncovered (for myself) a fundamental feature of the for loop, or is this just something weird that resulted from my code?

short_list = ['1787cairo.jpg', '237398rochester.jpg']
print short_list
for entry in short_list:
    entry_pos = short_list.index(entry)
    for char in entry:
        print entry, char, ord(char)
        if ord(char) in range (48,58):
            entry = entry.replace(char,'')
        print entry
    short_list[entry_pos] = entry              
print short_list

Upvotes: 0

Views: 247

Answers (2)

Hugh Bothwell

Reputation: 56674

Try this instead:

from string import digits

def remove_chars(s, bad_chars):
    """
    Return `s` with any chars in `bad_chars` removed
    """
    bad_chars = set(bad_chars)
    return "".join(ch for ch in s if ch not in bad_chars)

short_list = ['1787cairo.jpg', '237398rochester.jpg']
short_list = [remove_chars(entry, digits) for entry in short_list]

which gives

['cairo.jpg', 'rochester.jpg']

Upvotes: 0

Daniel Roseman

Reputation: 599778

The point here is that Python variables are really just names that point at objects. When you write for char in entry, the for loop obtains an iterator over whatever object entry points to at that moment; if you then reassign entry to point to something else, the iterator won't know that.
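
A minimal sketch of that rebinding behaviour, reduced from the question's code. The list named seen is only an illustrative helper that records what the loop actually visits; the snippet should run unchanged on Python 2 or 3.

```python
entry = '1787cairo.jpg'
seen = []  # illustrative helper: records every character the loop visits

for char in entry:  # the iterator is created once, over the original string
    seen.append(char)
    if char.isdigit():
        # str.replace returns a NEW string; the name entry is rebound to it,
        # but the iterator still holds the original object
        entry = entry.replace(char, '')

print(''.join(seen))  # the loop walked the original string
print(entry)          # the name now points at the stripped string
```

The loop visits all thirteen characters of the original filename even though entry was rebound on the very first pass.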

Note that if entry happened to be a mutable object, like a list, and you mutated the items in that object, the values seen by the iterator would change; again, this is because the iterator is pointing to the object itself.
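
To illustrate the mutable case, here is a small sketch using a hypothetical list called items (not part of the question's code). The list is changed in place while the loop is running, and the loop sees the change because its iterator refers to the same list object.

```python
items = ['a', 'b', 'c']
seen = []  # illustrative helper: records every item the loop visits

for item in items:
    seen.append(item)
    if item == 'a':
        # in-place mutation of the same list object the iterator is walking
        items[2] = 'z'

print(seen)
```

Here the third value visited is 'z' rather than 'c'. Changing the length of a list while iterating over it is a different matter and is best avoided.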

Really, though, your code is over-complicated: rather than keeping indexes and replacing items in the list, you should build a new list with the changed items:

new_list = []
for entry in short_list:
    new_entry = ''
    for char in entry:
        # keep only characters whose code is outside '0'-'9' (48-57)
        if ord(char) not in range(48, 58):
            new_entry += char
    new_list.append(new_entry)

and this can be shortened further to a nested list comprehension:

[''.join(char for char in entry if ord(char) not in range(48, 58)) for entry in short_list]

(and, as a further improvement, your check of ord(char) can be replaced by char.isdigit().)
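
Putting that last suggestion together, a sketch of the comprehension using isdigit() might look like this. The name cleaned is just an illustrative choice, and short_list is copied from the question.

```python
short_list = ['1787cairo.jpg', '237398rochester.jpg']

# Build a new list; each entry keeps only its non-digit characters
cleaned = [''.join(char for char in entry if not char.isdigit())
           for entry in short_list]

print(cleaned)  # ['cairo.jpg', 'rochester.jpg']
```

One caveat: on Unicode strings, isdigit() also matches digit characters outside ASCII 0-9, so it is slightly broader than the ord() range check.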

Upvotes: 7
