Kolom
Kolom

Reputation: 297

set vs list performance difference

I was solving this question from project Euler, in python 3.9. One thing caught my attention was using a set performs better than using a list.

enter image description here

Chart above shows time vs number of items for set and list. List time increase exponentially, while sets perform on relatively constant performance.

My question is why list has poor performance when number of items increase, while sets perform on quite constant performance?

Here is my code:

from time import perf_counter
import matplotlib.pyplot as plt

def using_set(limit):
    seq = set()
    for i in range(2,limit):
        for j in range(2,limit):
            seq.add(i**j)
    #print(len(seq))

def using_list(limit):
    l = []
    for a in range(2,limit):
        for b in range (2,limit):
            c = a**b
            if c not in l:
                l.append(c)
    #print(len(l))

yset = []
ylist = []
xline = []
for i in range(1, 10_00, 10):
    start = perf_counter()
    using_set(i)
    end = perf_counter()
    yset.append(end - start)
    xline.append(i)

    start = perf_counter()
    using_list(i)
    end = perf_counter()
    ylist.append(end-start)
plt.plot(xline, yset, label="set")
plt.plot(xline, ylist, label="list")
plt.xlabel("number of items")
plt.ylabel("time (seconds)")
plt.title("Set vs List time performance")
plt.legend()
plt.show()

Upvotes: 0

Views: 1820

Answers (1)

Jiří Baum
Jiří Baum

Reputation: 6930

The reference for this would be the Python wiki "Time Complexity" page

In particular, adding an item to a set has complexity O(1) on average, meaning constant time regardless of the size of the set; meawhile, list x in l has complexity O(n), meaning it grows linearly with the size of the list.

Upvotes: 3

Related Questions