Reputation: 157
I learned that the Linux kernel manages memory in units of pages, and that the page size, the unit of allocation and deallocation, is 4 KB. I also know that each page is described by a struct page. Here is the actual code:
struct page {
unsigned long flags; /* Atomic flags, some possibly
* updated asynchronously */
/*
* Five words (20/40 bytes) are available in this union.
* WARNING: bit 0 of the first word is used for PageTail(). That
* means the other users of this union MUST NOT use the bit to
* avoid collision and false-positive PageTail().
*/
union {
struct { /* Page cache and anonymous pages */
/**
* @lru: Pageout list, eg. active_list protected by
* pgdat->lru_lock. Sometimes used as a generic list
* by the page owner.
*/
struct list_head lru;
/* See page-flags.h for PAGE_MAPPING_FLAGS */
struct address_space *mapping;
pgoff_t index; /* Our offset within mapping. */
/**
* @private: Mapping-private opaque data.
* Usually used for buffer_heads if PagePrivate.
* Used for swp_entry_t if PageSwapCache.
* Indicates order in the buddy system if PageBuddy.
*/
unsigned long private;
};
struct { /* page_pool used by netstack */
/**
* @dma_addr: might require a 64-bit value even on
* 32-bit architectures.
*/
dma_addr_t dma_addr;
};
struct { /* slab, slob and slub */
union {
struct list_head slab_list;
struct { /* Partial pages */
struct page *next;
#ifdef CONFIG_64BIT
int pages; /* Nr of pages left */
int pobjects; /* Approximate count */
#else
short int pages;
short int pobjects;
#endif
};
};
struct kmem_cache *slab_cache; /* not slob */
/* Double-word boundary */
void *freelist; /* first free object */
union {
void *s_mem; /* slab: first object */
unsigned long counters; /* SLUB */
struct { /* SLUB */
unsigned inuse:16;
unsigned objects:15;
unsigned frozen:1;
};
};
};
struct { /* Tail pages of compound page */
unsigned long compound_head; /* Bit zero is set */
/* First tail page only */
unsigned char compound_dtor;
unsigned char compound_order;
atomic_t compound_mapcount;
};
struct { /* Second tail page of compound page */
unsigned long _compound_pad_1; /* compound_head */
atomic_t hpage_pinned_refcount;
/* For both global and memcg */
struct list_head deferred_list;
};
struct { /* Page table pages */
unsigned long _pt_pad_1; /* compound_head */
pgtable_t pmd_huge_pte; /* protected by page->ptl */
unsigned long _pt_pad_2; /* mapping */
union {
struct mm_struct *pt_mm; /* x86 pgds only */
atomic_t pt_frag_refcount; /* powerpc */
};
#if ALLOC_SPLIT_PTLOCKS
spinlock_t *ptl;
#else
spinlock_t ptl;
#endif
};
struct { /* ZONE_DEVICE pages */
/** @pgmap: Points to the hosting device page map. */
struct dev_pagemap *pgmap;
void *zone_device_data;
/*
* ZONE_DEVICE private pages are counted as being
* mapped so the next 3 words hold the mapping, index,
* and private fields from the source anonymous or
* page cache page while the page is migrated to device
* private memory.
* ZONE_DEVICE MEMORY_DEVICE_FS_DAX pages also
* use the mapping, index, and private fields when
* pmem backed DAX files are mapped.
*/
};
/** @rcu_head: You can use this to free a page by RCU. */
struct rcu_head rcu_head;
};
union { /* This union is 4 bytes in size. */
/*
* If the page can be mapped to userspace, encodes the number
* of times this page is referenced by a page table.
*/
atomic_t _mapcount;
/*
* If the page is neither PageSlab nor mappable to userspace,
* the value stored here may help determine what this page
* is used for. See page-flags.h for a list of page types
* which are currently stored here.
*/
unsigned int page_type;
unsigned int active; /* SLAB */
int units; /* SLOB */
};
/* Usage count. *DO NOT USE DIRECTLY*. See page_ref.h */
atomic_t _refcount;
#ifdef CONFIG_MEMCG
struct mem_cgroup *mem_cgroup;
#endif
/*
* On machines where all RAM is mapped into kernel address space,
* we can simply calculate the virtual address. On machines with
* highmem some memory is mapped into kernel virtual memory
* dynamically, so we need a place to store that address.
* Note that this field could be 16 bits on x86 ... ;)
*
* Architectures with slow multiplication can define
* WANT_PAGE_VIRTUAL in asm/page.h
*/
#if defined(WANT_PAGE_VIRTUAL)
void *virtual; /* Kernel virtual address (NULL if
not kmapped, ie. highmem) */
#endif /* WANT_PAGE_VIRTUAL */
#ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
int _last_cpupid;
#endif
} _struct_page_alignment;
I have no idea where the Linux kernel stores this huge(?) structure.
The kernel handles a lot of pages, which means there are a lot of struct page
instances. Where in memory are they stored?
I also have no idea what the union above is for.
Upvotes: 4
Views: 6035
Reputation: 50
If you want to know the exact place where it is stored, this article might give you the answer.
It says that they are "usually stored at the beginning of ZONE_NORMAL".
Upvotes: 0
Reputation: 329
I had the same question recently: where is struct page stored in the Linux kernel?
I think I can give you some helpful information. Every node stores its pages in the node_mem_map member of its pg_data_t, and there is a global variable, mem_map, which points to Node 0's page array.
You can find more details in Professional Linux Kernel Architecture, chapter 3, in the section "Creating Data Structures for Each Node".
Upvotes: 3
Reputation: 1494
First, there are several memory models, such as FMM, SMP, and NUMA. I won't cover virtual memory, pages and page tables, or the layout of kernel and user-space memory here, because that would exceed the length limit of an answer, and I think you can learn it from any book on the subject.
Let's use NUMA as an example. In NUMA, every CPU belongs to a node, described by struct pglist_data *node_data. This struct has several zones, such as ZONE_DMA, ZONE_DMA32, ZONE_NORMAL, ZONE_HIGHMEM, and ZONE_MOVABLE. Every zone has an array of struct free_area, and each of these holds lists of struct page.
Why do we need pages? Physical memory is limited, so we create virtual memory. We then need a mechanism to map virtual memory onto physical memory in order to run a task (a process or thread). So we use the page as the basic unit of bookkeeping, and use models like NUMA to organize those pages and page tables.
When we need to allocate memory, the buddy system and the slab/SLUB allocators hand out pages from a zone that is allowed to satisfy the request. When physical memory runs low, allocation through get_page_from_freelist() can trigger reclaim, and kswapd reclaims or swaps out pages in the background.
Actually, memory and address spaces are an essential part of the Linux kernel. I suggest reading a book like CSAPP to get a relatively deep understanding of operating systems, and then reading the Linux kernel source code to dive deeper.
Upvotes: 3