Number of memory accesses with Demand Paging

I have been studying Operating Systems Concepts and the book I am referring to is Operating System Concepts by Peter B. Galvin, Greg Gagne and Abraham Silberschatz.

In the chapter on Virtual Memory, the book starts to talk about paging and the number of memory accesses the system requires to read data stored in a particular frame in memory, given a logical address. The author states that when Page Table is present in Main Memory, system would need two memory accesses to read data stored in a frame. The first access is made to the page table to read the correct frame number, and the next access reads the byte/word from the frame.

After a few sections, the book talks about demand paging and page faults. The authors state that when there is no page fault, one memory access is needed, and when there is a page fault, we consider the page fault service time (which comprises swap-in time, swap-out time, one memory access, etc.). They then present the formula

Effective Access Time = (1-p) x one memory access time + p x page fault service time

where p = page fault rate
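To make the formula concrete, here is a minimal sketch that evaluates it. The 200 ns memory access time, the 8 ms page fault service time and the function name `effective_access_time` are assumptions chosen for illustration, not values taken from the question.

```python
def effective_access_time(p, memory_access_ns, fault_service_ns):
    """Weighted average of the no-fault case and the page-fault case."""
    return (1 - p) * memory_access_ns + p * fault_service_ns


# Assumed, illustrative figures (not from the question).
MEMORY_ACCESS_NS = 200
FAULT_SERVICE_NS = 8_000_000  # 8 ms expressed in nanoseconds

for p in (0.0, 0.001, 0.01):
    eat = effective_access_time(p, MEMORY_ACCESS_NS, FAULT_SERVICE_NS)
    print(f"p = {p}: effective access time is about {eat:.1f} ns")
```

Even a small fault rate dominates the result, because the service time is several orders of magnitude larger than a memory access.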

I cannot wrap my head around why the author suggests that, in case of no page fault, only one memory access will be needed. Applying the line of thought the same author(s) used earlier for the standard paging scheme, we should need one memory access to read the page table and another to read the data from the frame.

Is it because we are talking about the time frame after the access to the page table has been made? If so, why does the same method of calculation not apply to the standard version of paging?

Upvotes: 3

Answers (2)

user3344003

Reputation: 21712

The fundamental source of your problem is that you are reading a book that is only fit for lining a cat box. What you are describing is nonsensical gibberish that textbooks use to create confusion among students. This is not a case of oversimplification, because the authors apparently throw in a nonsensical formula for access times.

A formula like this

Effective Access Time = (1-p) x one memory access time + p x page fault service time

is total bovine fecal waste matter with no basis in reality.

"The author states that when Page Table is present in Main Memory, system would need two memory accesses to read data stored in a frame."

The processor has to translate logical addresses to physical addresses using the page tables. Assuming that there is no caching in the CPU, the CPU has to read the page table for each memory access.

The number of reads depends upon the page table format used by the CPU.

Let's suppose your process has a multi-level page table. In that case the CPU has to make a read for each level of the table.
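As a rough illustration of one read per level, here is a toy walk over a multi-level table built from nested dictionaries. The function name `translate`, the bit widths and the table contents are all hypothetical; real hardware formats differ, and this sketch assumes no caching at all.

```python
def translate(root, vaddr, bits_per_level, offset_bits):
    """Walk a toy multi-level page table.

    Returns (physical address, number of page table reads).
    """
    reads = 0
    node = root
    shift = offset_bits + sum(bits_per_level)
    for bits in bits_per_level:
        shift -= bits
        index = (vaddr >> shift) & ((1 << bits) - 1)
        node = node[index]  # one memory read per table level
        reads += 1
    offset = vaddr & ((1 << offset_bits) - 1)
    # After the last level, node holds the frame number.
    return (node << offset_bits) + offset, reads


# Hypothetical two-level table: 2 index bits per level, 4 offset bits.
root = {1: {2: 7}}
vaddr = (1 << 6) | (2 << 4) | 5
paddr, table_reads = translate(root, vaddr, [2, 2], 4)
print(paddr, table_reads, table_reads + 1)  # last value includes the data read
```

In this model an access costs one read per level plus the final read of the data itself.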

If you have a CPU that has separate linear system and user page tables, with the user tables in logical addresses, each access to the system space requires one page table read and each access to the user space requires at least two page table reads and might, in fact, trigger a page fault. The first read is to the system page table to find the user page table entry. The second read is to the user page table. The third is to the data.

In reality, every CPU on the planet does page table caching so separate reads are not required (all the time).

"I cannot wrap my head around why the author suggests that, in case of no page fault, only one memory access will be needed."

It sounds like the book is not being consistent in its BS.

The reality is that logical memory translation requires a number of steps. However, what those steps are depends upon the state of the processor, something that is unpredictable. These steps take place transparently behind the scenes and you do not even need to grasp all of them to understand operating systems.

What you need to know in the real world is that the CPU translates logical addresses to physical addresses. If the CPU is unable to make that translation, it triggers a page fault.

Upvotes: 0

Brendan

Reputation: 37262

Note: I haven't read/seen this book.

For educational material, if the author describes reality accurately with all the details, the reader will just get confused and won't be able to learn. To work around that, authors simplify (omit details and ignore reality) while introducing different concepts, so that the reader is able to learn each concept one at a time while building up the knowledge needed to comprehend the complexity of reality.

The problem is that different simplifications make sense at different stages, and authors are human (imperfect), so sometimes the simplifications that were beneficial at one point (in one chapter) conflict with simplifications that are beneficial at a later point (in a different chapter).

For example, I might (initially) tell someone "each access from virtual memory involves a second memory fetch from RAM to determine the translation" to help them understand how page tables work and that there are (potential) performance problems involved (twice as many memory accesses). Then I might introduce the concept of "translation look-aside buffers" (after the reader understands how page tables work and knows about the problem that TLBs are designed to solve). Then I might explain that often real systems have multiple levels of page tables (e.g. on 64-bit 80x86 it's four levels, potentially involving 4 memory accesses to determine a translation) and that there might be higher level caches/buffers involved (and not just TLBs that cache final translations). In this case, my original statement ("each access from virtual memory involves a second memory fetch from RAM to determine the translation") is a deliberate lie (a simplification) to avoid the complexity of a statement like "each access from virtual memory may or may not involve one or more additional fetches from some or all levels of page tables" (which is too confusing for beginners initially, because it creates lots of questions that they don't have answers to yet).
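The first two stages of that progression can be sketched with a toy model: a single-level page table plus a TLB, counting memory accesses. The class `ToyMMU` and its fields are hypothetical, and the model ignores page faults, TLB eviction and CPU data caches.

```python
class ToyMMU:
    """Single-level page table with an unbounded TLB (illustrative only)."""

    def __init__(self, page_table, page_size=4096):
        self.page_table = page_table  # page number -> frame number
        self.page_size = page_size
        self.tlb = {}                 # cached translations
        self.memory_accesses = 0

    def read(self, vaddr):
        page, offset = divmod(vaddr, self.page_size)
        if page not in self.tlb:
            self.memory_accesses += 1  # fetch the page table entry
            self.tlb[page] = self.page_table[page]
        self.memory_accesses += 1      # fetch the data itself
        return self.tlb[page] * self.page_size + offset


mmu = ToyMMU({0: 5})
mmu.read(10)
print(mmu.memory_accesses)  # TLB miss: page table entry plus data
mmu.read(20)
print(mmu.memory_accesses)  # TLB hit: only one more access
```

The first read of a page costs two accesses and later reads of the same page cost one, which is how both "two memory accesses" and "one memory access" can be reasonable simplifications.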

"I cannot wrap my head around why the author suggests that, in case of no page fault, only one memory access will be needed."

One reality is (for one real 80x86 CPU in long mode but not all 80x86 CPUs in long mode and not any 80x86 in other modes, if virtualisation is not being used), for a read from virtual memory that does not lead to a page fault, if the access is not misaligned/split across page boundaries (where CPU would have to do it all twice to fetch bytes from 2 different pages and merge the bytes):

    * if the translation is not in the TLB, then:
        * if the area is not in the "page directory cache"
            * fetch the PML4 entry to determine address of PDPT (try L1 cache, then L2 cache, then L3 cache, then RAM)
            * do access checks based on flags in PML4 entry
            * fetch the PDPT entry to determine address of PD (try L1 cache, then L2 cache, then L3 cache, then RAM)
            * do access checks based on flags in PDPT entry
            * insert data into "page directory cache"
        * if the area is in the "page directory cache"
            * do access checks based on flags in "page directory cache entry"
        * fetch the PD entry to determine address of PT (try L1 cache, then L2 cache, then L3 cache, then RAM)
        * do access checks based on flags in PD entry
        * fetch the PT entry to determine address of page (try L1 cache, then L2 cache, then L3 cache, then RAM)
        * do access checks based on flags in PT entry
        * insert data into TLB (including setting the "accessed" flag in the page table entry)
    * if the translation is in the TLB, then:
        * do access checks based on flags in "TLB entry"
    * do the "physical address = physical address of page + offset in page" calculation
    * read the data for the physical address (try L1 cache, then L2 cache, then L3 cache, then RAM)

For this reality (with the restrictions mentioned), the number of fetches from RAM can be anything from zero to 5.
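The zero-to-5 range can be checked with a small counting model of the steps listed above. The function `ram_fetches` and its parameters are hypothetical; "cached" simply means the item was found in L1, L2 or L3 and so did not need a fetch from RAM.

```python
from itertools import product

ITEMS = ("PML4", "PDPT", "PD", "PT", "data")


def ram_fetches(tlb_hit, pdc_hit, cached):
    """Count RAM fetches for one read that does not page fault.

    cached: set of ITEMS served by a CPU cache instead of RAM.
    """
    needed = []
    if not tlb_hit:
        if not pdc_hit:
            needed += ["PML4", "PDPT"]  # page directory cache miss
        needed += ["PD", "PT"]          # rest of the table walk
    needed.append("data")               # the read itself
    return sum(1 for name in needed if name not in cached)


# Try every combination of TLB hit, page directory cache hit and cache contents.
counts = set()
for tlb_hit, pdc_hit in product((True, False), repeat=2):
    for flags in product((True, False), repeat=len(ITEMS)):
        cached = {name for name, hit in zip(ITEMS, flags) if hit}
        counts.add(ram_fetches(tlb_hit, pdc_hit, cached))

print(sorted(counts))
```

The best case is a TLB hit with the data already in a CPU cache, and the worst case misses everywhere.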

Can you see why the author (while trying to explain page faults and not trying to explain translation costs) might want to avoid showing something like this and might simplify (by assuming that only one fetch is needed because the translation is in the TLB) instead?

Upvotes: 1
