Sam

Reputation: 1050

Python Memory Size

If I have a list of numbers and if I implement the same thing using dictionary, will they both occupy the same memory space?

E.g.

aList = [[0, 1], [1, 0]]
aDict = {'0,0': 0, '0,1': 1, '1,0': 1, '1,1': 0}

Will the memory sizes of aList and aDict be the same? Which will take up more space?

Upvotes: 0

Views: 1941

Answers (4)

kevpie

Reputation: 26108

In short, they won't be the same. The list should use less memory, unless your data is sparse enough that a dict storing only the populated entries comes out ahead.

As several others have mentioned, sys.getsizeof doesn't include the sizes of the contained objects.

Here is a recipe that does some work for you on standard python (3.0) types.

Compute Memory footprint of an object and its contents
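The linked recipe walks containers recursively and sums their shallow sizes. A minimal sketch of the idea (simplified here, handling only the built-in container types; the real recipe also supports custom handlers and verbose output):

```python
import sys

def total_size(obj):
    """Roughly sum sys.getsizeof over an object and everything it contains."""
    seen = set()  # ids of objects already counted, so shared objects count once

    def sizeof(o):
        if id(o) in seen:
            return 0
        seen.add(id(o))
        size = sys.getsizeof(o)
        if isinstance(o, dict):
            size += sum(sizeof(k) + sizeof(v) for k, v in o.items())
        elif isinstance(o, (list, tuple, set, frozenset)):
            size += sum(sizeof(i) for i in o)
        return size

    return sizeof(obj)
```

Note the `seen` set: without it, shared objects (such as interned small ints) would be counted once per reference rather than once overall.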

Using this recipe on python 3.1 here are some results:

aList = [[x,x] for x in range(1000)]
aListMod10 = [[x%10,x%10] for x in range(1000)]
aTuple = [(x,x) for x in range(1000)]
aDictString = dict(("%s,%s" % (x,x),x) for x in range(1000))
aDictTuple = dict(((x,x),x) for x in range(1000))

print("0", total_size(0))
print("10", total_size(10))
print("100", total_size(100))
print("1000", total_size(1000))

print("[0,1]", total_size([0,1]))
print("(0,1)", total_size((0,1)))

print("aList", total_size(aList))
print("aTuple", total_size(aTuple))
print("aListMod10", total_size(aListMod10))
print("aDictString", total_size(aDictString))
print("aDictTuple", total_size(aDictTuple))

print("[0]'s", total_size([0 for x in range(1000)]))
print("[x%10]'s", total_size([x%10 for x in range(1000)]))
print("[x%100]'s", total_size([x%100 for x in range(1000)]))
print("[x]'s", total_size([x for x in range(1000)]))

Output:

0 12
10 14
100 14
1000 14

[0,1] 70
(0,1) 62

aList 62514
aTuple 54514
aListMod10 48654
aDictString 82274
aDictTuple 74714

[0]'s 4528
[x%10]'s 4654
[x%100]'s 5914
[x]'s 18514

It seems to follow that the most memory-efficient option would be to use 2 parallel lists:

list_x = [0, 1, ...]
list_y = [1, 0, ...]

This may only be worth it if you are running tight on memory and your list is expected to be large. I would guess that the usage pattern would be creating (x,y) tuples all over the place anyway, so it may be that you should really just do:

tuples = [(0, 1), (1, 0), ...]
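You can compare the container overhead of the two layouts yourself (a rough comparison: getsizeof doesn't follow pointers, and the shared small-int objects are ignored here):

```python
import sys

n = 1000
pairs = [(x, x) for x in range(n)]   # one list of 2-tuples
xs = list(range(n))                  # two parallel lists
ys = list(range(n))

# Each tuple in `pairs` is a separate object on top of the list slot
# that points to it; the parallel lists pay only for two pointer arrays.
tuple_layout = sys.getsizeof(pairs) + sum(sys.getsizeof(t) for t in pairs)
list_layout = sys.getsizeof(xs) + sys.getsizeof(ys)
print(tuple_layout, list_layout)
```

The exact numbers depend on the build, but the tuple layout pays for one extra object per pair.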

All things being equal, choose what allows you to write the most readable code.

Upvotes: 1

aaronasterling

Reputation: 71064

Getting the size of objects in python is tricky. Ostensibly this can be done with sys.getsizeof, but it returns incomplete data.

An empty list takes 32 bytes on my system.

>>> sys.getsizeof([])
32

A list with one element takes 36 bytes. This does not seem to vary with the element.

>>> sys.getsizeof([[1, 2]])
36
>>> sys.getsizeof([1])
36

So you would need to know the size of the inner list as well.

>>> sys.getsizeof([1, 2])
40

So your memory usage for the list (assuming the same sizes as my system) should be 32 bytes plus 44 bytes for every internal list. Python stores 32 bytes of overhead for the list object itself, and each additional entry is represented as a pointer to the stored object, costing 4 bytes on top of whatever that object takes. A two-element inner list therefore costs 40 bytes (32 + 2 × 4), plus 4 bytes for its own pointer in the outer list, hence 44.
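The per-slot pointer cost can be observed directly by growing a list one element at a time. The 4-byte figures above come from a 32-bit build; on a 64-bit build each pointer costs 8 bytes instead:

```python
import sys

# Lists built with the repeat operator are allocated to exact size,
# so consecutive sizes differ by exactly one pointer.
sizes = [sys.getsizeof([0] * n) for n in range(5)]
deltas = [b - a for a, b in zip(sizes, sizes[1:])]
print(sizes, deltas)
```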

For a dict, it's 136 bytes for an empty dict

>>> sys.getsizeof({})
136

From there, its size quadruples each time adding members causes it to run out of space or risk frequent hash collisions. Again, you have to include the sizes of the keys and of the objects being stored as well.

>>> sys.getsizeof({1: 2})
136
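You can watch the resizing happen by inserting keys and checking the reported size after each one (the exact thresholds and growth factor vary across CPython versions):

```python
import sys

d = {}
sizes = []
for i in range(20):
    d[i] = i
    sizes.append(sys.getsizeof(d))  # jumps when an insert triggers a resize
print(sizes)
```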

Upvotes: 1

zneak

Reputation: 138251

Chances are very good that the dictionary will be bigger.

A dictionary and a list are fairly different data structures. When it boils down to memory and processor instructions, a list is fairly straightforward: all its values are contiguous, and when you want to access item n, you go to the beginning of the list, move n items forward, and return it. This is easy because list items are contiguous and have integer keys.

On the other hand, the constraints on dictionaries are fairly different. You can't just go to the beginning of the dictionary, move forward by the key, and return the result, because the key might not be numeric at all. Besides, keys don't need to be contiguous.

In the case of a dictionary, you need a structure that finds the value associated with a key very easily, even though there might be no relationship between them. Therefore, it can't use the same kind of algorithms a list uses, and typically the data structures required by dictionaries are larger than those required for lists.
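For the grid in the question, the difference shows up in how an element is addressed (an illustrative sketch; the variable names are made up here):

```python
# Position-based addressing: the offset is computed from the indices.
grid_list = [[0, 1], [1, 0]]

# Key-based addressing: a hash table maps arbitrary keys to values,
# which is why keys need be neither numeric nor contiguous.
grid_dict = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

print(grid_list[1][0], grid_dict[(1, 0)])
```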

By coincidence, both could end up the same size, though that would be rather surprising. Either way, the underlying data representation will be different.

Upvotes: 1

sahhhm

Reputation: 5365

If you're using Python 2.6 or higher, you can use sys.getsizeof() to run a quick test:

>>> import sys
>>> aList = [[0,1],[1,0]]
>>> aDict = {'0,0' : 0 , '0,1' : 1, '1,0' : 1, '1,1' : 0 }
>>> sys.getsizeof(aList)
44
>>> sys.getsizeof(aDict)
140

With the example you provided, we see that aDict takes up more space in memory.

Upvotes: 3
