arunmoezhi

Reputation: 3200

Number of multiplications per clock cycle on Intel Xeon Phi

In Intel Xeon Phi there are 32 512-bit-wide vector registers per core. Each vector register can do 16 single-precision floating-point operations per cycle, and 2 operations can be done in 1 cycle (1 in the v-pipe and 1 in the u-pipe).

I want to know how many scalar multiplications can be done in 1 clock cycle apart from the vector multiplications done in the vector registers.

Upvotes: 0

Views: 455

Answers (1)

sssylvester

Reputation: 168

There are some misconceptions there. There is 1 vector unit per core. Registers store values; they do not compute. So you can issue one 512-bit-wide vector operation per cycle per core. You can do a scalar multiply in 1 cycle as well, but you cannot issue both at the same time. Using the u and v pipes, you can issue one vector or scalar operation in one pipe and a memory operation in the other. The vector operation can also be a fused multiply-add (FMA) instruction, which performs a multiply and an add on every element, so it effectively gives you 2 vector operations per cycle per core.
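To make the arithmetic concrete, here is a rough sketch of the per-core peak, assuming one 512-bit vector instruction issued per cycle as described above. The function and constant names are made up for illustration, and the result is a theoretical upper bound, not a measured figure.

```python
# Back-of-the-envelope peak throughput for one core, assuming a single
# 512-bit vector instruction can be issued per cycle (as stated above).
# All names here are hypothetical helpers, not part of any Intel tool.

VECTOR_WIDTH_BITS = 512
SINGLE_PRECISION_BITS = 32
DOUBLE_PRECISION_BITS = 64


def lanes(element_bits, vector_bits=VECTOR_WIDTH_BITS):
    """Number of elements that fit in one vector register."""
    return vector_bits // element_bits


def peak_flops_per_cycle(element_bits, fused_multiply_add=True):
    """Floating-point operations per cycle per core.

    A fused multiply-add counts as 2 operations per lane (one multiply,
    one add); a plain multiply or add counts as 1.
    """
    ops_per_lane = 2 if fused_multiply_add else 1
    return lanes(element_bits) * ops_per_lane


if __name__ == "__main__":
    print("SP lanes:", lanes(SINGLE_PRECISION_BITS))
    print("SP multiplies/cycle:", peak_flops_per_cycle(SINGLE_PRECISION_BITS, False))
    print("SP flops/cycle with FMA:", peak_flops_per_cycle(SINGLE_PRECISION_BITS))
    print("DP flops/cycle with FMA:", peak_flops_per_cycle(DOUBLE_PRECISION_BITS))
```

So the "16 per cycle" in the question is the lane count of one vector instruction, and the factor of 2 comes from FMA rather than from issuing two arithmetic instructions at once.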

Upvotes: 1
