How does the referencing of objects and variables in programs work?

Question

Disclaimer: I am not a very experienced guy, and many questions might seem stupid or badly phrased.

I have heard about stacks and heaps and read a bit about them, but still a few things I don't quite understand:

How does a program find empty memory to store new variables/objects in physical memory.
How does a program know where an object starts and where an object ends in memory. With number variables I can imagine there is a few extra information provided in memory that show the porgram how many bits the variable occupies, but correct me if I'm wrong.
This is similar to my first question, but: when a variable has a value representd only by zeros, how does the program not confuse that with free memory.
Does the object value null mean that the address of an object is a bunch of 0's or does the object point to litterally nothing? And if so, how is the "reference" stored to assign it an address later on?

user3344003 · Accepted Answer

How does a program find empty memory to store new variables/objects in physical memory.

Modern operating systems use logical address translation. A process sees a range of logical addresses—its address space. The system hardware breaks the address range into pages. The size of the page is system dependent and is often configurable. The operating system manages page tables that map logical pages to physical page frames of the same size.

The address space is divided into a range of pages that is the system space, shared by all processes, and a user space, that is generally unique to each process.

Within the user and system spaces, pages may be valid or invalid. An invalid page has not yet been mapped to the process address space. Most pages are likely to be invalid.

Memory is always allocated from the operating system image pages. The operating system will have system services that transform invalid pages into valid pages with mappings to physical memory. In order to map pages, the operating system needs to find (or the application needs to specify) a range of pages that are invalid and then has to allocate physical page frames to map to the those pages. Note that physical page frames do not have to be mapped contiguously to logical pages.

You mention stacks and heaps. Stacks and heap are just memory. The operating system cannot tell whether memory is a stack, heap or something else. User mode libraries for memory allocation (such as those that implement malloc/free) allocate memory in pages to create heaps. The only thing that makes this memory a heap is that there is a heap manager controlling it. The heap manager can then allocate smaller blocks of memory from the pages allocated to the heap.

A stack is simpler. It is just a contiguous range of pages. Typically an operating system service that creates a thread or process will allocate a range of pages for a stack and assign the hardware stack pointer register to the high end of the stack range.

How does a program know where an object starts and where an object ends in memory. With number variables I can imagine there is a few extra information provided in memory that show the porgram how many bits the variable occupies, but correct me if I'm wrong.

This depends upon how the program is created and how the object is created in memory. For typed languages, the linker binds variables to addresses. The linker also generates instruction for mapping those addresses to the address space. For stack/auto variables, the compiler generates offsets from a pointer to the stack. When a function/subroutine gets called, the compiler generates code to allocate the memory required by the procedure, which it does by simply subtracting from the stack pointer. The memory gets freed by simply adding that value back to the stack pointer.

In the case of typeless languages, such as assembly language or Bliss, the programmer has to keep track of the type for each location. When memory is dynamically, the programmer also has to keep track of the type. Most programming languages help this out by having pointers with types.

This is similar to my first question, but: when a variable has a value representd only by zeros, how does the program not confuse that with free memory.

Free memory is invalid. Accessing free memory causes a hardware exception.

Does the object value null mean that the address of an object is a bunch of 0's or does the object point to litterally nothing? And if so, how is the "reference" stored to assign it an address later on?

The linker defines the initial state of a program's user address space. Most linkers do not map the first page (or even more than one page). That page is then invalid. That means a null pointer, as you say, references absolutely nothing. If you try to dereference a null pointer you will usually get some kind of access violation exception

Most operating system will allow the user to map the first page. Some linkers will allow the user to override the default setting and map the first page. This is not commonly done as it makes detecting memory error difficult.

How does the referencing of objects and variables in programs work?

Answers (2)

Related Questions