quakes

Reputation: 461

Time complexity of creating nested lists in Python

I have run into an oddity with creating nested lists in Python 2.6.6.

Consider the following two functions:

import time

def lists(n):
    start_time = time.time()
    lists = [None]*n
    for i in xrange(n):
        lists[i] = [None]*n
        for j in xrange(n):
            lists[i][j] = []        # every cell gets a fresh empty list
    print time.time() - start_time

def simple_lists(n):
    start_time = time.time()
    lists = [None]*n
    for i in xrange(n):
        lists[i] = [None]*n
        for j in xrange(n):
            lists[i][j] = False
    print time.time() - start_time

They both allocate an n*n grid. One assigns every cell the value False, and the other assigns every cell an empty list. As far as I can see, both should run in O(n^2). However, this does not seem to be the case... Observe the following test runs:

>>> for i in [4000, 8000, 16000]: simple_lists(i)
2.11170578003
8.67467808723
34.0958559513
>>> for i in [1000, 2000, 4000]: lists(i)
1.13742399216
7.39806008339
78.0808939934

As you can see, simple_lists() grows roughly as O(n^2), while lists() grows at well over O(n^3)!

What's going on here? This quirk is completely and utterly ruining the complexity of my code, and I can't explain why it's behaving like this. Does anyone have any ideas?

Edit: List comprehensions seem to cause the same complexity issues.

Redefining lists() like

def lists(n):
    start_time = time.time()
    lists = [[[] for y in xrange(n)] for x in xrange(n)]
    print time.time() - start_time

causes the following results

>>> for i in [1000, 2000, 4000]: lists(i)
0.388785839081
4.45830011368
65.6449248791

... which is still clearly growing at a pace quicker than O(n^2) (and even quicker than O(n^3) - sheesh).

Edit 2: After looking into the problem further, it seems the slowdown is caused by the garbage collector. After running gc.disable(), these are the results with the original lists() definition:

>>> for i in [1000, 2000, 4000]: lists(i)
...
0.155457019806
0.616811990738
2.38965821266

which is pretty damn neatly O(n^2).

Moral of the story: don't trust the garbage collector!
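For reference, the measurement above can be reproduced with something along these lines (a minimal sketch; it assumes the lists() definition from the top of the question):

import gc

gc.disable()     # switch off the cyclic garbage collector before timing
for i in [1000, 2000, 4000]:
    lists(i)     # lists() as defined above; prints the elapsed seconds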

Upvotes: 3

Views: 1492

Answers (3)

Dean Yang

Reputation: 101

Generate the inner list once, then build the outer list from it; that runs in O(n) time.

import time

def simple_list(n):
    start_time = time.process_time()
    k = [[] for y in range(n)]        # build the inner list once: O(n)
    lists = [k for x in range(n)]     # note: every row is a reference to the same list k
    end_time = time.process_time()
    print(end_time - start_time)

Upvotes: 0

quakes

Reputation: 461

Turns out the weird behavior was caused by the garbage collector. Turning it off with gc.disable() yielded the following:

>>> for i in [1000, 2000, 4000]: lists(i)
...
0.155457019806
0.616811990738
2.38965821266

which fits O(n^2) like a glove.
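For code that shouldn't leave collection switched off for the whole program, a variation (just a sketch; build_grid is a hypothetical helper, not something from the question) is to pause the collector only around the allocation and re-enable it afterwards:

import gc

def build_grid(n):
    # allocate the n*n grid of empty lists with the collector paused
    gc.disable()
    try:
        return [[[] for y in xrange(n)] for x in xrange(n)]
    finally:
        gc.enable()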

Upvotes: 2

James Moughan

Reputation: 413

On my machine

for i in [1000, 2000, 4000]: lists(i)

gives

0.994000196457
4.31200003624
17.9900000095

which is a nice neat O(n^2). The last one consumes 1GB of memory, and so lists(8000) thrashes the hard drive and causes my machine to misbehave. delnan is probably right and I'd check your machine's free memory and python's memory consumption during the operation.
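To check Python's memory consumption during the operation, one option (a sketch, not part of the original answer; the resource module is Unix-only, and ru_maxrss is reported in kilobytes on Linux but bytes on macOS) is:

import resource

def peak_memory():
    # peak resident set size of this process so far (units are platform-dependent)
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

lists(4000)              # assumes the lists() definition from the question
print peak_memory()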

Upvotes: 2
