Reputation: 335
In Python 3.4, I am populating a dictionary in a large loop, storing 30000 * 1000 double values in it. I would like to allocate memory for the dictionary beforehand so that I can reduce the performance overhead of allocating memory on each iteration.
Also, how can I check the limit on the memory size that can be allocated to a dictionary (or list) in Python? For example, if it only allows 50 MB, I will try to avoid exceeding it. This may depend on the operating system and other factors, but I would like to have an idea of how to maximize performance.
I can use
ll = [None] * 1000
to allocate memory for a list.
Is there a similar way to do this for a dictionary?
d = {None} * 1000 ?
or
d = {None: None} * 1000 ?
thanks
Upvotes: 5
Views: 8495
Reputation: 2251
You don't need to "allocate memory" for your Python objects. You can use .append to grow a list dynamically.
Preallocating memory makes sense if you know the data type that will be put into the list, in which case I would look at numpy.
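As a rough sketch of that idea (the sizes are taken from the question, and the fill step is just a placeholder):

    import numpy as np

    # 30000 rows of 1000 doubles (~240 MB) allocated in a single block up front.
    values = np.empty((30000, 1000), dtype=np.float64)

    # Fill by index instead of appending.
    for i in range(values.shape[0]):
        values[i, :] = i * 0.5  # placeholder computation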
If you know the keys, you can use dictionary = {a: None for a in range(100)}, but you are probably better off with a collections.defaultdict
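For example, a defaultdict lets you accumulate values without declaring any keys up front (a small sketch, not tied to your actual data):

    from collections import defaultdict

    # Missing keys are created on first access using the factory (here float() -> 0.0).
    totals = defaultdict(float)
    for key, value in [("a", 1.5), ("b", 2.0), ("a", 0.5)]:
        totals[key] += value  # no KeyError, no preallocation needed

    print(dict(totals))  # {'a': 2.0, 'b': 2.0}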
Upvotes: 2
Reputation: 95652
Pre-allocating the list ensures that the allocated index values will work. I assume that's what you mean by preallocating a dict. In that case:
d = dict.fromkeys(range(1000))
or use any other sequence of keys you have handy. If you want to preallocate a value other than None you can do that too:
d = dict.fromkeys(range(1000), 0)
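One caveat worth noting: fromkeys does not copy the value for each key, so a mutable default ends up shared between all of the keys:

    # fromkeys stores the *same* value object for every key, so a mutable
    # default (like a list) is shared across all keys.
    d = dict.fromkeys(range(3), [])
    d[0].append("x")
    print(d)  # {0: ['x'], 1: ['x'], 2: ['x']}

    # With an immutable value such as 0 or None this sharing is harmless.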
Edit: since you've edited your question to clarify that you meant to preallocate the memory, the answer is no, you cannot preallocate the memory for a dict, nor would it be useful to do so. Most of the memory used isn't the dictionary itself; it will be taken by the objects used as keys and values. The dictionary itself allocates memory in amortized constant time: it starts off small and then resizes in progressively larger chunks, so the overall cost of insertion is effectively constant.
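You can watch that resizing happen from the interpreter: sys.getsizeof of the dict stays flat and then jumps at certain sizes (the exact thresholds vary between Python versions, so this is only illustrative):

    import sys

    d = {}
    last = sys.getsizeof(d)
    for i in range(100):
        d[i] = None
        size = sys.getsizeof(d)
        if size != last:  # report only when the underlying table was resized
            print(i + 1, "keys ->", size, "bytes")
            last = size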
Storing 30 million entries in a dictionary will require approximately 120 MB or 240 MB for the dict itself, but the individual objects will require a lot more, so unless you have a lot of RAM in your system it will be the contents of the dictionary that give you a problem rather than the dictionary itself.
If you fire up the interactive prompt you'll find that it only takes a few seconds to run this:
>>> d = dict.fromkeys(range(30000000))
>>> import sys
>>> sys.getsizeof(d)
1610613016
So that's 1,610,613,016 bytes (1.5 GB) for a dictionary that contains only integer keys where all the values are None. Store unique values as well and you've roughly doubled the size if they're just integers too, but if they're strings or complex objects your memory consumption will be much higher.
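Also note that sys.getsizeof only measures the dict's own table, not the keys and values it references, so a rough total has to add those up separately (a sketch that ignores any sharing between objects):

    import sys

    d = {i: str(i) * 10 for i in range(1000)}

    container = sys.getsizeof(d)  # size of the dict's own table only
    contents = sum(sys.getsizeof(k) + sys.getsizeof(v) for k, v in d.items())
    print(container, contents)  # the keys and values dominate once they are real objects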
Upvotes: 5
Reputation: 531125
The immediate question appears to be: is * defined for a dict operand? The answer is no.
>>> {None:None} * 3
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for *: 'dict' and 'int'
You can create a dictionary with a given number of keys with
d = dict((i,None) for i in range(1000))
or more idiomatically for Python 2.7 or 3.x
d = {i: None for i in range(1000)}
This creates a dictionary with 1000 unique integer keys (which, as an aside, is semantically equivalent to your original list example).
Upvotes: 1
Reputation: 31494
The problem here is that you need to know the keys in advance. For example, you could do:
d = {i: None for i in range(1000)}
but you can only do that if you know that the keys are 0 ... 999.
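If the keys only become known while you iterate, you can just assign them as they turn up and let the dict grow on its own (a minimal sketch with a made-up key):

    d = {}
    for i in range(30000):
        key = ("row", i)    # stand-in for a key that only becomes known inside the loop
        d[key] = i * 0.001  # the dict grows automatically as entries are added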
Upvotes: 2