RajSanpui

Reputation: 12064

Confusion about the fork() system call

I have created a parent and a child process using fork(), and both use a memory address called "ptr". But I am confused by the output of the program:

1) Address of ptr: 123456 NOTE: The address is the same for both parent and child, so I expected that if one process alters the value at this address, the change would be visible to the other process too.

2) Parent: *ptr=44

3) Child: *ptr=33

4) Printing values: The parent still retains the old value: printf("ptr = %d",*ptr); // Output: still 44, expected 33. The child prints 33, the expected value: printf("ptr = %d",*ptr); // Prints 33 fine

Question 1) Can anyone tell me how the values can be different, although the pointer address is the same for both the parent and the child?

Question 2) I am working on a memory leak tool which reports a double free error, because it sees the parent and the child free the same address. However, as we see, this is not actually a double free. How can I solve this problem, given that the memory address the tool sees for the parent and the child is the same?

P.S: Please see the below code snippet:

#include <sys/types.h>
#include <unistd.h>
#include <cstdio>  // printf
#include <cstdlib> // malloc
int main()
{
 pid_t pid;
 int *ptr;
 ptr=(int*)malloc(sizeof(int));
 *ptr=33; // Parent keeps the data as 33, before forking.

 pid=fork(); // Returns 0 in the child and the child's pid in the parent

 if(pid==0){*ptr=44;} // Child modifies data, which is ignored by parent

 // Now we print the memory address and the value both by child and parent
  if(pid==0)
  {
    printf("Child data: %d\n",*ptr);
    printf("Child address: %p\n",(void*)ptr); // %p is the portable format for pointers
  }
  if(pid>0)
  {
    printf("Parent data: %d\n",*ptr);
    printf("Parent address: %p\n",(void*)ptr);
  }
}

Output:

Child data: 44
Child address: 123456

Parent data: 33 (how come still old value?)
Parent address: 123456 (How come same address but data different than child?)

Upvotes: 1

Views: 1294

Answers (3)

kriss

Reputation: 24177

Even if you think of memory as a large buffer with individual addresses, there is more to it.

The above view is true enough for physical memory, but modern processors include an MMU (Memory Management Unit), which maps pages of virtual memory onto pages of physical memory. Virtual memory also allows pages to be moved to disk (swap) when there is not enough physical memory on a given system for the running programs.

When running a C program in user space (or even a program written in assembler), what you access is virtual memory, and the addresses are virtual memory addresses. To keep things simple for compilers and program loaders, on modern operating systems every process has its own independent address space, and address spaces are unrelated to one another (as if every process could access the whole memory space of the machine). Of course, if a process accesses a virtual address that is not mapped to anything (neither physical memory nor swap), this causes a "segmentation fault".

When a process is created using fork, the memory space of the parent is duplicated in the child (i.e. there is the same data at the same virtual addresses in both). After the fork they diverge when memory is changed in one of the processes and not in the other. The actual mechanism is slightly more complex; it's usually copy-on-write: whenever a memory page is modified, a copy of that page is made first, and as long as no change is made the two processes read from the same physical memory. That explains what you see when changing values in the parent or child process: you see either what was put there before forking (common to both processes) or different values if it was changed after the fork.

To make processes communicate with one another you have to use some communication layer (socket, file, pipe, shared memory, etc.). And don't believe that using shared memory is especially simple or faster compared to the other methods; that is not true.

By the way, that's the difference between processes and threads. Every process has its own memory, while threads share the same memory space. What you thought was true for processes (created by fork) is basically true for threads.

Shared memory space is also basically true for kernel level programming, but then fork wouldn't be available anyway.

Upvotes: 0

asveikau

Reputation: 40246

if(pid==0){*ptr=44;} // Child modifies data, which is ignored by parent
Question1) Can anyone tell me, how the values are different? Although the pointer address is same for both the parent and the child?

This is the whole idea. They may have the same address, but these addresses are virtual. Each process has its own address space. What fork() does is create a new process and make its virtual memory layout look like the parent's.

See the Wikipedia article on page tables and similar topics for some illustrations of how this works.

-- (Long aside follows) --

What typically happens at a fork() is that page tables for both the parent and child are set up such that the pages are marked as read-only. When a write instruction happens for some location the kernel gets a page fault, which the CPU generates at a bad memory access. The kernel will allocate new memory for the trapped process, map it into the right address by manipulating its page table, copy the old buffer to the newly allocated one and let the write continue. This is called copy-on-write. This makes the initial fork quick and keeps memory consumption down for pages that are not written in either process.

The previous paragraph is all just an optimization of the fork programming model. They say early Unix didn't do this -- it did a full memory copy of the whole process. I've also heard that Cygwin's fork() does a full copy.

But the virtual address has nothing to do with the physical address of the memory. The CPU uses it as a "key" for the page table, which defines where the actual memory is. The page table may also say that the page is not valid, in which case the kernel has an opportunity to do a "fixup" (perform copy-on-write, recall the page from swap space, etc.) or kill the process in the case of a legitimately invalid pointer access.

Upvotes: 5

Volker Stolz

Reputation: 7402

You misunderstood how memory works in a Unix-like system: the memory of parent and child are independent. If you want them to communicate, you can set up explicitly shared memory, or IPC.

Upvotes: 2
