KhaledWas

Reputation: 73

Multi-threaded process: multi-core, or single-core at double the speed?

Assume I have a process that consists of two ideally independent tasks (ideally, to remove the communication overhead). Would it be faster to run it on a single-core processor at 3GHz or on a two-core processor at 1.5GHz?

Of course, in the case of the two-core processor, the job is ideal for parallelization, and on the single core the two tasks will be time-shared.

Update: the question in other words

Is a single-core processor of double the speed always a better option than a two-core processor?

Upvotes: 2

Views: 521

Answers (2)

user2467198

Reputation:

The question as posted is severely underspecified. First, it appears to confuse performance with processor frequency. Even with identical core microarchitectures, memory latencies are not fixed in cycle counts. Traversing a billion-item linked list is a (contrived) workload that depends on memory latency, where two parallel "half-speed" threads would be faster than time-slicing.
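The linked-list point can be illustrated with a rough back-of-the-envelope model. This is only a sketch: the function names, the 80 ns latency and the 10 cycles of work per node are made-up illustrative values, not measurements, and context switches are ignored. Each dependent access pays a fixed latency in nanoseconds, while only the compute part scales with clock frequency.

```python
def run_time_ns(accesses, latency_ns, cycles_per_access, freq_ghz):
    """Time for one task: every dependent access pays a fixed latency (ns)
    plus some compute cycles, and cycles / GHz gives nanoseconds."""
    return accesses * (latency_ns + cycles_per_access / freq_ghz)


def compare(accesses, latency_ns, cycles_per_access):
    """Return (single_core_ns, dual_core_ns) for two identical tasks."""
    # One 3 GHz core runs the two tasks back to back (switch cost ignored).
    single = 2 * run_time_ns(accesses, latency_ns, cycles_per_access, 3.0)
    # Two 1.5 GHz cores run one task each, fully in parallel.
    dual = run_time_ns(accesses, latency_ns, cycles_per_access, 1.5)
    return single, dual


if __name__ == "__main__":
    # Latency-bound case (assumed 80 ns per pointer chase).
    print(compare(1_000_000, 80.0, 10))
    # Purely compute-bound case (no memory latency at all).
    print(compare(1_000_000, 0.0, 10))
```

Under these assumptions the two slower cores finish the latency-bound case in roughly half the time, while in the purely compute-bound case the two configurations tie.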

If the lower frequency is not the result of product binning, power-saving configuration, or the like, but of a shallower pipeline (at the same width), then the "slower" processor would have a lower branch misprediction penalty and a lower latency in cycles to the same cache capacity, leading to higher instructions per cycle on most workloads.

Even with identical microarchitectures, two cores also avoid the cache warm-up overhead of context switches. The cost of a context switch is not just the time taken to invoke the OS, run the OS scheduler, and swap register contents; each thread also has to refill the caches after the other thread has run. (With only two active threads on two cores, the OS scheduler overhead would be slightly lower because there are no other ready threads, but there would be twice as many timer interrupts.) If the tasks are run in batch mode, one after the other, such context switch overhead would be avoided.
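To put a rough number on the context switch cost, here is a minimal sketch. All names and figures are hypothetical: a 10 ms time slice and 100 µs per switch (OS path plus cache re-warming) are assumptions chosen for illustration, and the work is assumed to scale perfectly with clock frequency. Times are integer microseconds, measured at the 3GHz clock.

```python
def time_shared_us(work_us, quantum_us, switch_cost_us):
    """Wall time for two tasks of work_us each sharing one 3 GHz core.
    Every time slice ends with one switch costing switch_cost_us."""
    total_work = 2 * work_us
    quanta = -(-total_work // quantum_us)  # integer ceiling division
    return total_work + quanta * switch_cost_us


def dedicated_cores_us(work_us):
    """Wall time on two 1.5 GHz cores: each task takes twice as long at
    half the clock, but the tasks run in parallel with no switching."""
    return 2 * work_us


if __name__ == "__main__":
    work = 1_000_000  # 1 s of work per task at 3 GHz (assumed)
    print(time_shared_us(work, quantum_us=10_000, switch_cost_us=100))
    print(dedicated_cores_us(work))
```

In this model the single fast core is slower by exactly the accumulated switch cost, so the gap grows with shorter time slices or larger cache footprints.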

Another factor to consider is whether the two tasks encounter independent bottlenecks. For example, if one task is extremely compute intensive but the other is bound by main memory bandwidth, then running them in parallel can provide better performance than time slicing; with time slicing the memory bandwidth potential is unused during the compute-intensive time slices.
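A small sketch of the independent-bottleneck case, under simplifying assumptions that are mine rather than measured: the compute task uses no memory bandwidth, the memory task's run time does not depend on core frequency, and the workload sizes are invented for illustration.

```python
def time_sliced_s(compute_cycles, mem_bytes, freq_hz, bandwidth_bytes_per_s):
    """One core: the tasks alternate, so their run times add up."""
    return compute_cycles / freq_hz + mem_bytes / bandwidth_bytes_per_s


def parallel_s(compute_cycles, mem_bytes, freq_hz, bandwidth_bytes_per_s):
    """Two cores: the tasks overlap, so the slower one sets the finish time."""
    return max(compute_cycles / freq_hz, mem_bytes / bandwidth_bytes_per_s)


if __name__ == "__main__":
    cycles = 3e9        # compute-bound task (assumed size)
    data = 20e9         # bytes streamed by the memory-bound task (assumed)
    bandwidth = 10e9    # bytes per second of memory bandwidth (assumed)
    print(time_sliced_s(cycles, data, 3.0e9, bandwidth))  # single 3 GHz core
    print(parallel_s(cycles, data, 1.5e9, bandwidth))     # two 1.5 GHz cores
```

With these numbers the time-sliced run takes 3 seconds and the parallel run 2 seconds, because the memory bandwidth sits idle whenever the compute task holds the single core.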

Yet another factor is interference with constrained resources. For example, DRAM can suffer from bank conflicts which can substantially reduce effective bandwidth. If memory addressing and timing happen to cause maximum conflicts during parallel operation, then effective bandwidth would be reduced. A similar effect can be generated from limited associativity in a shared last level cache.

More recent processors also tend to be thermally limited, so the double-frequency processor might not be able to sustain that frequency under maximum utilization if that frequency is not guaranteed under power virus conditions, whereas the alternative two-core system is less likely to encounter that power density constraint.

Upvotes: 2

Riad Baghbanli

Reputation: 3319

Two ideally independent tasks running on a non-ideal OS like Windows 2012 will run faster on 2 cores at 1.5GHz than on 1 core at 3GHz, because thread context switching overhead is eliminated.

Unfortunately, there are very few ideally independent tasks.

Upvotes: 3
