user997112

Reputation: 30615

What is the relationship between assembly and multi-core?

This is hard to word/ask so please bear with me:

When we see the output of assembly, this is what is going to be executed on the core(s) of the CPU. However, if a CPU has multiple cores, is all of the assembly executed on the same core? At what point would assembly from the same program begin executing on a different core?

So if I had (assembly pseudo):

ADD x, y, z
SUB p, x, q

how will I know whether ADD and SUB will execute on the same core? Is this linked to affinity? I thought affinity only pinned a process to a CPU, not a core?

I am asking this because I want to understand whether you can reasonably predict whether consecutive assembly instructions execute on the same core, and whether I can force them to execute only on the same core. I am also trying to understand how the decision is made to move execution of the same program code from one core to another.

If execution can move (even when using affinity) from Core1 of CPU A to Core2, is this where QPI link speed takes effect, and where it matters whether the caches are shared amongst the different CPU cores?

Upvotes: 1

Views: 1353

Answers (3)

I'm talking mostly about Linux, but I guess what I am saying should be applicable to other OSes. However, without access to the Windows source code, no one can reliably say how it behaves in detail.

I think your "abstraction" of what a computer is doing is inadequate. Basically, a (mono-threaded) process (or just a thread) is running on some "virtual" CPU, whose instruction set is the unprivileged x86 machine instructions augmented by the ability to enter the kernel through syscalls (usually through a special instruction like SYSENTER). So from an application's point of view, system calls to the Linux kernel are "atomic". See this and that answer.

Indeed, the processor is getting interrupts at arbitrary instants (on Linux, running cat /proc/interrupts twice with a one-second delay would show you how often it is getting interrupted, typically many thousands of times per second), and these interrupts are handled by the kernel. The kernel schedules tasks (e.g. threads or processes) preemptively: they can be interrupted and restarted by the kernel at any time.
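On a Linux machine you can sample those counters directly (a sketch; the interrupt names and counts will differ from machine to machine):

```shell
# Sample the per-CPU interrupt counters twice, one second apart.
# On x86, LOC is the local APIC timer; each column is one CPU (core),
# and the counts grow between the two samples.
grep -E 'LOC|timer' /proc/interrupts
sleep 1
grep -E 'LOC|timer' /proc/interrupts
```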

From an application point of view, interrupts don't really exist (but the kernel can send signals to the process).

Cores, interrupts and caches are handled by the hardware and/or the kernel, so from the application's point of view they don't really exist, except by "slowing down" the process. Cache coherency is mostly dealt with in hardware, and together with out-of-order execution it makes the execution time of even a tiny binary program unpredictable. (In other words, you cannot statically predict exactly how many milliseconds a given routine or loop will need; you can only measure it dynamically; read more about worst-case execution time.)

Reading the Advanced Linux Programming book and the Linux Assembly Howto would help.

Upvotes: 2

Euro Micelli

Reputation: 33998

You cannot normally predict where each individual instruction will execute. As long as an individual thread is executing continuously, it will run on the same core/processor, but you cannot predict on which instruction the thread will be switched out. The OS makes that decision, along with the decision of when to switch it back in and which core/processor to put it on, based on the workload of the system and priority levels, among other things.

You can usually request that the OS always run a particular thread on the same core; this is called affinity. It is normally a bad idea and should only be done when absolutely necessary, because it takes away the OS's flexibility to decide what to run where based on the workload; affinity will almost always result in a performance penalty.

Requesting processor affinity is an extraordinary request that requires extraordinary proof that it would result in better performance. Don't try to outsmart the OS; the OS knows things about the current running environment that you don't know about.

Upvotes: 1

TAS

Reputation: 2079

This is a rough overview that hopefully will provide you with the details you need.

  1. Assembly code is translated into machine code, i.e. binary data, that is run by a CPU.

  2. A CPU is the same as a core on a multi-core processor; i.e. a CPU is not the same thing as the processor (chip).

  3. Every CPU has an instruction pointer that points to the instruction to execute next. This is incremented for every instruction executed.

So in a multi-core processor you would have one instruction pointer per core. To support more processes than there are available CPUs (or cores), the operating system will interrupt running processes and store their state (including the instruction pointer) at regular intervals. It will then restore the state of already interrupted processes and let them execute for a bit.

Which core execution continues on is up to the operating system to decide, and is influenced by the affinity of the running thread (and probably some other settings as well).

So to answer your question: there is no way of knowing whether two adjacent assembly instructions will run on the same core or not.

Upvotes: 2
