atkfan402

Reputation: 29

Is it faster to create a copy of a nested dictionary or access it directly when passing into a function in Python?

I'm passing a dictionary, one element of which is also a dictionary, to a function. I want to access the nested dictionary many times within the function, and so I was wondering whether it would be faster to create a copy of the nested dictionary as a local variable once at the beginning, and access it directly from that point forward, or to access it by way of the outer dictionary every time.

Upvotes: 1

Views: 313

Answers (2)

quamrana

Reputation: 39374

If you have this code:

d = {'nested':{}, 1:[], 2:[], 3:[]}

def f(d):
    n = d['nested']
    for k,v in n.items():
        ...

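Then calling f(d) is cheap: only a reference to d is passed as the argument, which costs about as much as copying a pointer, however large the dictionary is.

The line n = d['nested'] costs a little to look up the key. It does not copy the nested dictionary either; n is just another reference to it. That lookup cost is best paid once at the beginning of the function rather than every time the nested dictionary is accessed.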
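To make the point above concrete, here is a minimal sketch. The names same_object, add_key, outer and inner are made up for illustration and are not from the question. It shows that neither the call nor the assignment copies anything; a real copy only happens if you ask for one, for example with dict.copy() or copy.deepcopy().

```python
import copy

outer = {'nested': {'a': 1}, 1: [], 2: [], 3: []}

def same_object(d):
    # The parameter is bound to the caller's dictionary; nothing is copied.
    return d is outer

def add_key(d):
    n = d['nested']   # one lookup; n refers to the same inner dict
    n['b'] = 2        # so this change is visible through d as well
    return n

inner = add_key(outer)

# Explicit copies, by contrast, create new dictionary objects.
shallow = outer['nested'].copy()
deep = copy.deepcopy(outer)

print(same_object(outer), inner is outer['nested'], shallow is outer['nested'])
```

So "keeping a local variable" here means keeping a second name for the same dictionary, which is why it is cheap.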

Upvotes: 2

Jolly Roger

Reputation: 1134

I see that you have already accepted an answer, but it is sometimes good to look at measured values to confirm an intuition. I wrote a little script that manipulates a nested dictionary. The intuition is correct: it takes less time when you keep a reference to the nested dictionary. Here is the plot; read the details below to see how I got it.

[Plot: keeping a reference takes less time]

In the functions suffixed _noRef I do not keep a reference to the nested dictionary; in those suffixed _Ref I do. Each version runs a for loop that adds members to the nested dictionary in add and reads them back in count.

I time both the version with a reference and the version without. To be sure of the pattern, I repeat the measurement and collect several timings.

This is the script:

import timeit
import matplotlib.pyplot as plt

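# Note: the four module-level functions below are never called by the timing
# code; timeit only runs the versions defined in the SETUP_CODE strings.
def add_noRef(d):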
    for i in range(10000000):
        d['nested'][i] = 100*i

def add_Ref(d):
    r = d['nested']
    for i in range(10000000):
        r[i] = 100*i

def count_noRef(d):
    count = 0
    for i in range(10000000):
        count += d['nested'][i]
    return count

def count_Ref(d):
    r = d['nested']
    count = 0
    for i in range(10000000):
        count += r[i]
    return count 

def TEST_NO_REF(repeats):

    SETUP_CODE = '''
def add_noRef(d):
    for i in range(1000):
        d['nested'][i] = 100*i

def count_noRef(d):
    count = 0
    for i in range(1000):
        count += d['nested'][i]
    return count'''
    TEST_CODE = '''
d = dict()
d['nested'] = dict()
add_noRef(d)
x = count_noRef(d)
    '''
    # timeit.repeat statement
    times = timeit.repeat(setup = SETUP_CODE,
                          stmt = TEST_CODE,
                          repeat = repeats,
                          number = 10000)
    return times

def TEST_REF(repeats):

    SETUP_CODE = '''
def add_Ref(d):
    r = d['nested']
    for i in range(1000):
        r[i] = 100*i

def count_Ref(d):
    count = 0
    r = d['nested']
    for i in range(1000):
        count += r[i]
    return count'''
    TEST_CODE = '''
d = dict()
d['nested'] = dict()
add_Ref(d)
x = count_Ref(d)
    '''
    # timeit.repeat statement
    times = timeit.repeat(setup = SETUP_CODE,
                          stmt = TEST_CODE,
                          repeat = repeats,
                          number = 10000)
    return times

repeats = 10
X = list(range(1, repeats + 1))
time_Norefs = TEST_NO_REF(repeats)
time_refs = TEST_REF(repeats)

plt.plot(X, time_Norefs)

plt.plot(X, time_refs)

plt.legend(["Without using reference to nested dict", "Using reference to nested dict"])

plt.xlabel('Iteration')
plt.ylabel('time taken')
plt.title('Time taken to execute')
plt.savefig('timeTest.png')  # save before show(); the figure may be empty afterwards
plt.show()
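
For anyone who wants to try the comparison without matplotlib, here is a more compact sketch of the same idea. It is not the script that produced the plot. The names count_no_ref, count_ref, N and data are my own, the repeat and number values are arbitrary small choices, and it passes callables to timeit.repeat instead of code strings. It only measures the read loop and takes the minimum of the repeats as the least noisy figure.

```python
import timeit

N = 1000  # inner-loop size, same as in the setup strings above

def count_no_ref(d):
    total = 0
    for i in range(N):
        total += d['nested'][i]   # two lookups per iteration
    return total

def count_ref(d):
    r = d['nested']               # outer lookup done once
    total = 0
    for i in range(N):
        total += r[i]             # one lookup per iteration
    return total

data = {'nested': {i: 100 * i for i in range(N)}}

# timeit.repeat accepts a zero-argument callable as the statement.
t_no_ref = min(timeit.repeat(lambda: count_no_ref(data), repeat=5, number=200))
t_ref = min(timeit.repeat(lambda: count_ref(data), repeat=5, number=200))

print(f"without reference: {t_no_ref:.4f} s")
print(f"with reference:    {t_ref:.4f} s")
```

The absolute numbers depend on the machine and the Python version, so treat them as a relative comparison only.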

Upvotes: 2
