Kharrid
Kharrid

Reputation: 71

Remove repeating spaces manually in a string

I have a question about a certain piece of code. I was doing an exercise in python about strings. I had come up with the correct logic but, for some reason, the output inside the for loop is not returning correctly. Instead, the global value gets returned. I am not too familiar with Python, but is there some way to fix this?

def song_decoder(song):
    global Ret

    Ret = ""

    Ret = song.replace("WUB", " ")
    Ret = Ret.strip()
    Ret += "1"

    space = False
    for i in range(0, len(Ret)):
        if Ret[i] == "1":
            Ret = Ret[:i]
            break
        elif Ret[i] == " ":
            if space is False:
                space = True
            else:
                if i+1 == len(Ret):
                    Ret = Ret[:i]
                else:
                    Ret = Ret[:i] + Ret[(i+1):]
        else:
            space = False

    return Ret

Test code:

def test_song_decoder(self):
    self.assertEquals(song_decoder("AWUBBWUBC"), "A B C","WUB should be replaced by 1 space")
    self.assertEquals(song_decoder("AWUBWUBWUBBWUBWUBWUBC"), "A B C","multiples WUB should be replaced by only 1 space")
    self.assertEquals(song_decoder("WUBAWUBBWUBCWUB"), "A B C","heading or trailing spaces should be removed")

The second test fails, and 'A B C' is returned instead.

Upvotes: 0

Views: 242

Answers (2)

tzaman
tzaman

Reputation: 47790

You're going through way too much trouble trying to collapse multiple spaces into one:

def song_decoder(song, delimiter="WUB"):
    splits = song.split(delimiter)  # instead of replacing, just split on your delimiter
    cleaned = filter(None, splits)  # remove empty elements caused by consecutive WUBs
    return ' '.join(cleaned)        # join them up with a single space in between

Upvotes: 2

poke
poke

Reputation: 387687

First of all, there is no need for you to make Ret global here. So better remove that line.

Second, there is one test missing which will give you another hint:

>>> song_decoder("AWUBBWUBC")
'A B C'
>>> song_decoder("AWUBWUBBWUBWUBC")
'A B C'
>>> song_decoder("AWUBWUBWUBBWUBWUBWUBC")
'A  B  C'

As you can see, two WUBs are correctly replaced by only one space. The problem appears when there are three. This should give you a hint that the space detection doesn’t work correctly after you have made a replacement. The reason for this is actually rather simple:

# you iterate over the *initial* length of Ret
for i in range(0, len(Ret)):
    # ...
    elif Ret[i] == " ":
        if space is False:
            space = True
        else:
            # when you hit a space and you have seen a space directly
            # before then you check the next index…
            if i+1 == len(Ret):
                Ret = Ret[:i]
            else:
                # … and remove the next index from the string
                Ret = Ret[:i] + Ret[(i+1):]

    # now at the end of the loop, `i` is incremented to `i + 1`
    # although you have already removed the character at index `i`
    # making the next character you would have to check belong to
    # index `i` too

So the result is that you skip over the character that comes directly after the second space (which you remove). So it’s impossible to detect three spaces this way because you always skip the third one.

In general, it’s a very bad idea to iterate over something which you modify while doing that. In your case, you are iterating over the length of the string but the string actually gets shorter all the time. So you really should avoid doing that.

Instead of iterating over the Ret string, you should iterate over the original string which you keep constant:

def song_decoder(song):
    # replace the WUBs and strip spaces
    song = song.replace("WUB", " ").strip() 
    ret = ''

    space = False
    # instead of iterating over the length, just iterate over
    # the characters of the string
    for c in song:
        # since we iterate over the string, we don’t need to check
        # where it ends

        # check for spaces
        if c == " ":
            # space is a boolean, so don’t compare it against booleans
            if not space:
                space = True
            else:
                # if we saw a space before and this character is a space
                # we can just skip it
                continue
        else:
            space = False

        # if we got here, we didn’t skip a later space, so we should
        # include the current character
        ret += c

    return ret

Upvotes: 8

Related Questions