Reputation: 88
I'm trying to understand how sharing memory between processes works and I'm stuck.
I'm using a very simple test program, c.py, and tracking memory usage with smem.
c.py:
import sys
import time
from multiprocessing import Process

arr = [x for x in range(int(1e6) * 50)]
print(sys.getsizeof(arr))  # 411943896

def f():
    x = 0
    for i in range(len(arr)):
        #x += arr[i]
        pass
    time.sleep(10)

p = Process(target=f)
p.start()
p.join()
When I run it with x += arr[i] commented out, I see the following results:
    PID User Command                         Swap     USS     PSS     RSS
1693779 1000 python /usr/bin/smem -n -t         0    8368    9103   14628
1693763 1000 python c.py                        0    1248  992816 1986688
1693749 1000 python c.py                        0    1244  993247 1989752
-------------------------------------------------------------------------------
      3    1                                    0   10860 1995166 3991068
If I understand correctly, PSS is telling me that my single global array arr is shared between the two processes, and USS shows very little unique memory allocated per process.
However, when I uncomment x += arr[i], just accessing the array elements in the child process yields very different results:
    PID User Command                         Swap     USS     PSS     RSS
1695338 1000 python /usr/bin/smem -n -t         0    8476    9508   14392
1695296 1000 python c.py                       64 1588472 1786582 1986708
1695280 1000 python c.py                        0 1588644 1787246 1989520
-------------------------------------------------------------------------------
      3    1                                   64 3185592 3583336 3990620
I don't understand this. It seems that accessing the array caused it to be copied into the child process, meaning that Python actually copies shared memory on access, not on write.
Is my understanding correct? Has the memory where arr's data resides been copied to the child process when the global variable arr was accessed?
If so, is there no way for the child process to access the global variables without doubling memory usage?
I would also love it if someone could explain the overall memory usage smem reports in this case, though I suspect that question is better suited to Super User. If simple copying took place, I would expect the memory to double. However, each process shows unique memory of 1588472, and on top of that the PSS is 1786582 for each of the two processes, so it totals about 6750108? I'm pretty sure my understanding here is very wrong, but I don't know how to interpret it.
Upvotes: 3
Views: 1412
Reputation: 281013
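You are writing to the elements. The standard implementation of Python (CPython) uses reference counting, so even looking at an object requires a write to its reference count. That write lands in a page the child shares with the parent, so the kernel's copy-on-write mechanism copies the page.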
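A minimal sketch of the effect, assuming CPython: binding a list element to a name increments the reference count stored inside the int object itself, and that is the write that dirties the page after a fork. The names items, held, before and after are illustrative and not from the question.

```python
import sys

# Build the ints at runtime so they are ordinary heap objects
# (CPython caches small ints, which would hide the effect).
items = [10**6 + i for i in range(3)]

before = sys.getrefcount(items[0])
held = items[0]                      # only "reads" the element...
after = sys.getrefcount(items[0])    # ...yet the count stored in the object grew

# The absolute counts vary by interpreter version; the difference is the point.
print(before, after, after - before)
```

In the question's loop, x += arr[i] does the same to each of the 50 million int objects, so every page holding them is written and therefore copied in the child. The parent's USS rises as well because, once the child has its own copy, the original page is mapped only by the parent. The list's pointer array (the 411943896 bytes reported by sys.getsizeof) is only read, which is consistent with roughly 400 MB remaining shared in the second smem run, assuming smem reports KiB.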
Upvotes: 3