Alan

Reputation: 13721

How much memory does a thread consume when first created?

I understand that creating too many threads in an application isn't what you might call being a "good neighbour" to other running processes, since CPU and memory resources are consumed even if these threads are sleeping efficiently.

What I'm interested in is this: How much memory (win32 platform) is being consumed by a sleeping thread?

In theory, I'd assume somewhere in the region of 1 MB (since this is the default stack size), but I'm pretty sure it's less than this, though I'm not sure why.

Any help on this will be appreciated.

(The reason I'm asking is that I'm considering introducing a thread-pool, and I'd like to understand how much memory I can save by creating a pool of 5 threads, compared to 20 manually created threads)

Upvotes: 11

Views: 7337

Answers (6)

peterchen

Reputation: 41096

Adding to Fabio's comments:

Memory is your second concern, not your first. The purpose of a thread pool is usually to limit context-switching overhead by capping the number of threads that want to run concurrently, ideally at the number of CPU cores available.

A context switch is very expensive, often quoted at a few thousand to 10,000+ CPU cycles.
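The idea of capping the number of workers can be sketched roughly as follows. This is only a minimal illustration in portable C++ rather than Win32 code, and the names `FixedPool`, `submit`, and `core_count` are hypothetical ones made up for this example, not part of any library mentioned in this thread.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal fixed-size pool: N workers share one task queue,
// so at most N tasks ever compete for the CPU at the same time.
class FixedPool {
public:
    explicit FixedPool(std::size_t workers) {
        assert(workers > 0);
        for (std::size_t i = 0; i < workers; ++i)
            threads_.emplace_back([this] { run(); });
    }

    // Lets the workers drain the remaining queue, then joins them.
    ~FixedPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (auto& t : threads_) t.join();
    }

    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;

    // Queues a task and wakes one sleeping worker.
    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        wake_.notify_one();
    }

    std::size_t size() const { return threads_.size(); }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                // Idle workers block here without using any CPU.
                wake_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // only reached when stopping
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock so other workers can dequeue
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> threads_;
};

// hardware_concurrency() may return 0 when the count is unknown; fall back to 1.
inline std::size_t core_count() {
    return std::max<std::size_t>(1, std::thread::hardware_concurrency());
}
```

A caller would construct something like `FixedPool pool(core_count());` once and hand it all 20 jobs through `submit`, instead of starting 20 threads by hand.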

A little test on WinXP (32-bit) clocks in at about 15 KB of private bytes per thread (999 threads created). This is the initial committed stack size, plus any other data managed by the OS.

Upvotes: 5

INS

Reputation: 10820

I guess this can be measured quite easily.

  1. Get the amount of resources used by the system before creating a thread
  2. Create a thread with default system values (default stack size and others)
  3. Get the amount of resources after creating the thread and compute the difference from step 1.

Note that some threads need to be given values other than the defaults.

You can estimate the average memory use by repeating step 2 with various numbers of threads.

The memory allocated by the OS when creating a thread consists of the thread's local data: TCB, TLS, ...
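The three steps above could be sketched like this in portable C++. It is a rough illustration only: `bytes_per_sleeping_thread` and the `probe` parameter are hypothetical names, and how the process's memory use is actually read is left to the caller. On Win32 one option, which is my assumption and not something stated in this answer, would be a probe that wraps `GetProcessMemoryInfo` and returns the `PrivateUsage` field.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Estimates memory per sleeping thread as (after - before) / count.
// `probe` is a caller-supplied (hypothetical) function that returns the
// process's current memory use in bytes.
double bytes_per_sleeping_thread(const std::function<std::size_t()>& probe,
                                 std::size_t count) {
    assert(count > 0);

    std::mutex m;
    std::condition_variable cv;
    bool release = false;

    // Step 1: measure before creating any threads.
    const std::size_t before = probe();

    // Step 2: create threads with default settings; each one just sleeps
    // on the condition variable until released.
    std::vector<std::thread> threads;
    threads.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        threads.emplace_back([&] {
            std::unique_lock<std::mutex> lock(m);
            cv.wait(lock, [&] { return release; });
        });
    }

    // Step 3: measure again while the threads are still alive.
    const std::size_t after = probe();

    // Wake the threads and clean up.
    {
        std::lock_guard<std::mutex> lock(m);
        release = true;
    }
    cv.notify_all();
    for (auto& t : threads) t.join();

    return (static_cast<double>(after) - static_cast<double>(before)) /
           static_cast<double>(count);
}
```

Calling it with several different values of `count` and comparing the results gives a feel for the average, since any fixed one-off cost is spread over more threads as `count` grows.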

From wikipedia: "Threads do not own resources except for a stack, a copy of the registers including the program counter, and thread-local storage (if any)."

Upvotes: -1

Loki Astari

Reputation: 264401

This depends highly on the system:

But usually, each process is independent. Usually the system scheduler makes sure that each process gets equal access to the available processors. Thus a multithreaded application's time is multiplexed between its available threads.

Memory allocated to a thread will affect the memory available to its own process but not the memory available to other processes. A good OS will page out unused stack space so that it is not in physical memory. Though if your threads allocate enough memory while live, you could cause thrashing as each process's memory is paged to and from a secondary device.

I doubt a sleeping thread has any (or at most very little) impact on the system.

  • It is not using any CPU
  • Any memory it is using can be paged out to a secondary device.

Upvotes: 0

dkretz

Reputation: 37655

I think you'd have a hard time detecting any impact from making this kind of change to working code (20 threads down to 5). Then there is the added complexity (and overhead) of managing the thread pool. Maybe worth considering on an embedded system, but Win32?

And you can set the stack size to whatever you want.

Upvotes: 0

stephbu

Reputation: 5082

If you're using Vista or Win2k8 just use the native Win32 threadpool API. Let it figure out the sizing. I'd also consider partitioning types of workloads e.g. CPU intensive vs. Disk I/O into different pools.

MSDN Threadpool API docs

http://msdn.microsoft.com/en-us/library/ms686766(VS.85).aspx

Upvotes: 1

Fabio Ceconello

Reputation: 16049

I have a server application that makes heavy use of threads. It uses a configurable thread pool that is set up by the customer, and in at least one site it has 1000+ threads, yet when started up it uses only 50 MB. The reason is that Windows reserves 1 MB for the stack (it maps its address space), but that space is not necessarily allocated in physical memory; only a smaller part of it is. If the stack grows beyond that part, a page fault is generated and more physical memory is allocated. I don't know what the initial allocation is, but I would assume it's equal to the allocation granularity of the system (usually 64 KB). Of course, the thread would also use a little more memory for other things when created (TLS, TSS, etc.), but my guess for the total would be about 200 KB. And bear in mind that any memory that is not frequently used would be unloaded by the virtual memory manager.
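To make the gap between reserved and actually used memory concrete, here is the arithmetic for the figures above as a small C++ sketch. The helper names are hypothetical, the thread count is rounded to exactly 1000, and 1 MB is treated as 1 MiB; note that the 50 MB covers the whole process, so the result is only an upper bound on the average cost per thread.

```cpp
#include <cstddef>

constexpr std::size_t kKiB = 1024;
constexpr std::size_t kMiB = 1024 * kKiB;

// Address space reserved for stacks: one default-sized reservation per thread.
constexpr std::size_t reserved_stack_bytes(std::size_t threads,
                                           std::size_t reserve_per_thread = kMiB) {
    return threads * reserve_per_thread;
}

// Whole-process memory divided by thread count: an upper bound on the
// average memory actually in use per thread.
constexpr std::size_t average_bytes_per_thread(std::size_t process_bytes,
                                               std::size_t threads) {
    return threads == 0 ? 0 : process_bytes / threads;
}

// 1000 threads reserve about 1000 MiB of address space for their stacks...
static_assert(reserved_stack_bytes(1000) == 1000 * kMiB, "reserved address space");

// ...yet a 50 MiB process averages only about 51 KiB per thread.
static_assert(average_bytes_per_thread(50 * kMiB, 1000) == 52428, "average per thread");
```

So the reservation is roughly twenty times larger than what the process was observed to use in total, which fits the explanation that only a small part of each stack is backed by physical memory.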

Upvotes: 8
