coltonpagefault
coltonpagefault

Reputation: 45

Global variable messes up my recursive function

I've just run into a tricky issue. The following code is supposed to split words into chunks of length numOfChar. The function calls itself, which makes it impossible to have the resulting list (res) inside the function. But if I keep it outside as a global variable, then every subsequent call of the function with different input values leads to a wrong result because res doesn't get cleared.

Can anyone help me out?

Here's the code (in case you are interested, this is problem 7-23 from PySchools.com):

res = []

def splitWord(word, numOfChar):        
    if len(word) > 0:
        res.append(word[:numOfChar])
        splitWord(word[numOfChar:], numOfChar)    
    return res

print splitWord('google', 2)
print splitWord('google', 3)
print splitWord('apple', 1)
print splitWord('apple', 4)

Upvotes: 3

Views: 1754

Answers (3)

Helgi
Helgi

Reputation: 5436

A pure recursive function should not modify the global state, this counts as a side effect.

Instead of appending-and-recursion, try this:

def splitWord(word, numOfChar): 
    if len(word) > 0:
        return [word[:numOfChar]] + splitWord(word[numOfChar:], numOfChar)
    else:
        return []

Here, you chop the word into pieces one piece at a time, on every call while going down, and then rebuild the pieces into a list while going up.

This is a common pattern called tail recursion.

P.S. As @e-satis notes, recursion is not an efficient way to do this in Python. See also @e-satis's answer for a more elaborate example of tail recursion, along with a more Pythonic way to solve the problem using generators.

Upvotes: 5

Bite code
Bite code

Reputation: 596683

To elaborate on @Helgi answer, here is a more performant recursive implémentation. It updates the list instead of summing two lists (which results in the creation of a new object every time).

This pattern forces you to pass a list object as third parameter.

def split_word(word, num_of_chars, tail):

    if len(word) > 0:
        tail.append(word[:num_of_chars])
        return split_word(word[num_of_chars:], num_of_chars, tail)

    return tail

res = split_word('fdjskqmfjqdsklmfjm', 3, [])

Another advantage of this form, is that it allows tail recursion optimisation. It's useless in Python because it's not a language that performs such optimisation, but if you translate this code into Erlang or Lisp, you will get it for free.

Remember, in Python you are limited by the recursion stack, and there is no way out of it. This is why recursion is not the preferred method.

You would most likely use generators, using yield and itertools (a module to manipulate generators). Here is a very good example of a function that can split any iterable in chunks:

from itertools import chain, islice

def chunk(seq, chunksize, process=iter):
    it = iter(seq)
    while True:
        yield process(chain([it.next()], islice(it, chunksize - 1)))

Now it's a bit complicated if you start learning Python, so I'm not expecting you to fully get it now, but it's good that you can see this and know it exists. You'll come back to it later (we all did, Python iteration tools are overwhelming at first).

The benefits of this approach are:

  • It can chunk ANY iterable, not just strings, but also lists, dictionaries, tuples, streams, files, sets, queryset, you name it...
  • It accepts iterables of any length, and even one with an unknown length (think bytes stream here).
  • It eats very few memory, as the best thing with generators is that they generate the values on the fly, one by one, and they don't store the previous results before computing the next.
  • It returns chunks of any nature, meaning you can have a chunks of x letters, lists of x items, or even generators spitting out x items (which is the default).
  • It returns a generator, and therefor can be use in a flow of other generators. Piping data from one generator to the other, bash style, is a wonderful Python ability.

To get the same result that with your function, you would do:

In [17]: list(chunk('fdjskqmfjqdsklmfjm', 3, ''.join))
Out[17]: ['fdj', 'skq', 'mfj', 'qds', 'klm', 'fjm']

Upvotes: 2

NPE
NPE

Reputation: 500247

Recursion is completely unnecessary here:

def splitWord(word, numOfChar):
   return [word[i:i+numOfChar] for i in xrange(0, len(word), numOfChar)]

If you insist on a recursive solution, it is a good idea to avoid global variables (they make it really tricky to reason about what's going on). Here is one way to do it:

def splitWord(word, numOfChar):
   if len(word) > 0:
      return [word[:numOfChar]] + splitWord(word[numOfChar:], numOfChar)
   else:
      return []

Upvotes: 2

Related Questions