Reputation: 29
I'm passing a dictionary, one element of which is also a dictionary, to a function. I want to access the nested dictionary many times within the function, and so I was wondering whether it would be faster to create a copy of the nested dictionary as a local variable once at the beginning, and access it directly from that point forward, or to access it by way of the outer dictionary every time.
Upvotes: 1
Views: 313
Reputation: 39374
If you have this code:
d = {'nested': {}, 1: [], 2: [], 3: []}

def f(d):
    n = d['nested']
    for k, v in n.items():
        ...
Then calling f(d) is cheap: only the reference to d is passed as a parameter, so the call costs about the size of a pointer and the dictionary itself is not copied.
The line n = d['nested'] costs a little to look up the key, and it also binds n to the existing nested dictionary rather than copying it. This cost is best paid once at the beginning of the function rather than each time n is accessed.
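To make the "reference, not copy" point concrete, here is a minimal sketch. The key 'added_in_f' is only an illustrative name, not something from the question.

```python
def f(d):
    n = d['nested']          # binds a local name to the existing inner dict
    n['added_in_f'] = True   # mutates the one shared object
    return n

d = {'nested': {}, 1: [], 2: [], 3: []}
n = f(d)

print(n is d['nested'])   # True: same object, no copy was made
print(d['nested'])        # {'added_in_f': True}: the change is visible via d
```

If an independent copy is actually wanted, it has to be requested explicitly, for example with dict(n) for a shallow copy or copy.deepcopy(n) for a deep one.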
Upvotes: 2
Reputation: 1134
I see that you have already accepted an answer, but this may also be useful.
Sometimes it helps to look at measurements to confirm an intuition. I wrote a small script that manipulates a nested dictionary. The intuition is correct: it takes less time when you keep a reference to the nested dictionary. Here is the plot; read the details below to see how I got it.
In the functions ending in _noRef I am not keeping a reference to the nested dictionary. In the functions ending in _Ref I am keeping a reference to it. Each variant runs a for loop that adds members to the nested dictionary in add and accesses them in count.
I time both variants, and to be sure of the pattern I repeat the measurement several times.
This is the script:
import timeit
import matplotlib.pyplot as plt

# Standalone versions of the functions under test. The timed copies are
# defined in the SETUP_CODE strings; these are not called by the script.
def add_noRef(d):
    for i in range(10000000):
        d['nested'][i] = 100*i

def add_Ref(d):
    r = d['nested']
    for i in range(10000000):
        r[i] = 100*i

def count_noRef(d):
    count = 0
    for i in range(10000000):
        count += d['nested'][i]
    return count

def count_Ref(d):
    r = d['nested']
    count = 0
    for i in range(10000000):
        count += r[i]
    return count

def TEST_NO_REF(repeats):
    # Keep the code strings at column 0: timeit inserts them into its own
    # template, and extra leading indentation would cause an indentation error.
    SETUP_CODE = '''
def add_noRef(d):
    for i in range(1000):
        d['nested'][i] = 100*i
def count_noRef(d):
    count = 0
    for i in range(1000):
        count += d['nested'][i]
    return count'''
    TEST_CODE = '''
d = dict()
d['nested'] = dict()
add_noRef(d)
x = count_noRef(d)
'''
    # timeit.repeat statement
    times = timeit.repeat(setup = SETUP_CODE,
                          stmt = TEST_CODE,
                          repeat = repeats,
                          number = 10000)
    return times

def TEST_REF(repeats):
    # Keep the code strings at column 0: timeit inserts them into its own
    # template, and extra leading indentation would cause an indentation error.
    SETUP_CODE = '''
def add_Ref(d):
    r = d['nested']
    for i in range(1000):
        r[i] = 100*i
def count_Ref(d):
    count = 0
    r = d['nested']
    for i in range(1000):
        count += r[i]  # go through the saved reference
    return count'''
    TEST_CODE = '''
d = dict()
d['nested'] = dict()
add_Ref(d)
x = count_Ref(d)
'''
    # timeit.repeat statement
    times = timeit.repeat(setup = SETUP_CODE,
                          stmt = TEST_CODE,
                          repeat = repeats,
                          number = 10000)
    return times

repeats = 10
X = list(range(1, repeats + 1))
time_Norefs = TEST_NO_REF(repeats)
time_refs = TEST_REF(repeats)
plt.plot(X, time_Norefs)
plt.plot(X, time_refs)
plt.legend(["Without using reference to nested dict", "Using reference to nested dict"])
plt.xlabel('Iteration')
plt.ylabel('time taken')
plt.title('Time taken to execute')
# Save before show(): the figure may be empty once the window is closed.
plt.savefig('timeTest.png')
plt.show()
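
For a quick check without plotting, the same comparison can be sketched with timeit alone. This is only a minimal sketch: the names count_no_ref and count_ref, the dictionary size, and the repeat counts are illustrative assumptions, and absolute timings will vary by machine and Python version.

```python
import timeit

def count_no_ref(d, n):
    total = 0
    for i in range(n):
        total += d['nested'][i]   # outer and inner lookup on every iteration
    return total

def count_ref(d, n):
    r = d['nested']               # outer lookup done once
    total = 0
    for i in range(n):
        total += r[i]             # only the inner lookup remains in the loop
    return total

n = 1000
d = {'nested': {i: 100 * i for i in range(n)}}

# Both variants must compute the same value before timing means anything.
assert count_no_ref(d, n) == count_ref(d, n)

# Take the minimum of several repeats to reduce noise from other processes.
t_no_ref = min(timeit.repeat(lambda: count_no_ref(d, n), repeat=5, number=200))
t_ref = min(timeit.repeat(lambda: count_ref(d, n), repeat=5, number=200))
print(f"no ref: {t_no_ref:.4f}s  ref: {t_ref:.4f}s")
```

Passing callables to timeit.repeat avoids the setup strings, at the price of a small, equal lambda-call overhead in both measurements.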
Upvotes: 2