CPU organization

I have studied the following three types of CPU organization:

Single accumulator organization
General register organization
Stack organization

I also know that most computers fall into one of these three types of organization, and that some combine features of more than one.

I want to know: do our modern, ordinary computers (such as laptops and mobile phones) use combined features?

And which organization do those ordinary, modern computers use to perform arithmetic operations?

Please help, I really want to know about this.

Upvotes: -5

Answers (1)

Peter Cordes

Reputation: 365951

Pretty much every mainstream ISA is a register machine. (Assembly: Why are we bothering with registers?)

Stack and accumulator machines can have smaller instructions (more implicit, fewer explicit operands), but that's not worth the cost in extra data load/store instructions. Even with cache, the load or store-forwarding latency for accessing a memory address selected by a pointer register (like 4 to 5 cycles) is much higher than simply reading a register whose register number is encoded directly into the instruction.

Having multiple orthogonal registers also makes it easy for software to expose instruction-level parallelism to the hardware (without needing stuff like x87 fxch to swap a stack register to the top-of-stack). This can let the latency of independent operations overlap in pipelined and especially out-of-order execution CPUs.

See also https://www.realworldtech.com/architecture-basics/2/ for a history of the move toward load-store architectures, away from stack and accumulator machines, with super-basic diagrams of each. (Some CISCs like x86 allow reg,mem for ALU instructions, not just for loads and stores.)

Some old 8-bit micros like 6502 or 8080 could be argued to be accumulator machines, but they do have some other registers, e.g. to hold a pointer. Instructions for those ISAs have only 1 explicit operand, with the other implicit according to the opcode. For example, ORA src ORs src into the A register (the accumulator). But those ISAs are not modern.
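To make the operand-count difference concrete, here is a minimal sketch of two toy interpreters evaluating (a + b) * (c + d). Both instruction sets (`AccInsn`, `RegInsn`) are hypothetical and made up for illustration, not any real ISA. The register version assumes the four inputs are already sitting in registers; a real load-store machine would need loads to get them there, but the values could then stay in registers across many operations.

```rust
// Toy models only: both instruction sets are hypothetical, not a real ISA.

/// One-operand accumulator machine: the accumulator is the implicit
/// destination and first source of every ALU instruction.
enum AccInsn {
    Load(usize),  // acc = mem[addr]
    Add(usize),   // acc += mem[addr]
    Mul(usize),   // acc *= mem[addr]
    Store(usize), // mem[addr] = acc
}

/// Three-operand register machine: every operand is an explicit register number.
enum RegInsn {
    Add(usize, usize, usize), // regs[dst] = regs[a] + regs[b]
    Mul(usize, usize, usize), // regs[dst] = regs[a] * regs[b]
}

fn run_acc(prog: &[AccInsn], mem: &mut [i64]) -> i64 {
    let mut acc = 0i64;
    for insn in prog {
        match *insn {
            AccInsn::Load(a) => acc = mem[a],
            AccInsn::Add(a) => acc = acc.wrapping_add(mem[a]),
            AccInsn::Mul(a) => acc = acc.wrapping_mul(mem[a]),
            AccInsn::Store(a) => mem[a] = acc,
        }
    }
    acc
}

fn run_reg(prog: &[RegInsn], regs: &mut [i64]) {
    for insn in prog {
        match *insn {
            RegInsn::Add(d, a, b) => regs[d] = regs[a].wrapping_add(regs[b]),
            RegInsn::Mul(d, a, b) => regs[d] = regs[a].wrapping_mul(regs[b]),
        }
    }
}

/// mem[0..4] hold a, b, c, d; mem[4] is a temporary for the first sum.
fn acc_program() -> Vec<AccInsn> {
    use AccInsn::*;
    vec![Load(0), Add(1), Store(4), Load(2), Add(3), Mul(4)]
}

/// regs[0..4] hold a, b, c, d; regs[4] and regs[5] are scratch registers.
fn reg_program() -> Vec<RegInsn> {
    use RegInsn::*;
    vec![Add(4, 0, 1), Add(5, 2, 3), Mul(4, 4, 5)]
}

fn main() {
    let mut mem = [2, 3, 4, 5, 0];
    let acc = run_acc(&acc_program(), &mut mem);
    let mut regs = [2, 3, 4, 5, 0, 0];
    run_reg(&reg_program(), &mut regs);
    println!("accumulator machine: {acc}, register machine: {}", regs[4]);
}
```

In this sketch the accumulator program needs six instructions, including a store and reload of a temporary through memory, while the register program needs three. The two adds in the register program are also independent of each other, which is the kind of instruction-level parallelism a pipelined CPU can exploit.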

If by "ordinary computers" like laptops, you mean mainstream x86, see "What kind of address instruction does the x86 cpu have?". For its complete instruction set, https://www.felixcloutier.com/x86/ is an extract of Intel's manual that lists all the instructions it supports. It has 16 general-purpose integer registers (in 64-bit mode), and 16 FP / SIMD-vector registers. (And a legacy x87 FPU that interestingly uses a register-stack.)

I think I misread the question when I wrote this answer; I thought it was asking which arithmetic operations CPUs supported directly. That's what the rest of this answer is answering. I'll leave it here for anyone interested in a rambling overview of neat stuff some CPUs can do in one hardware operation.

But actually I think it's just asking about instruction operands, in which case it's simple: almost everything is a 2 or 3-operand register machine, except x86's legacy x87 FPU which is a register-stack architecture.

Modern ISAs in general have instructions for all the basic integer bitwise operations, shifts (and sometimes rotates), and integer + - * / (and remainder from division). Also usually some kind of hardware bit-scan, and often a popcount. Some have a bit-reverse and/or byte-reverse, like x86 bswap or ARM rbit / rev. Most ISAs support extended-precision arithmetic efficiently with add-with-carry and sub-with-borrow instructions, and often a widening multiply like 64x64 bit inputs => 128-bit product.
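As a rough illustration, these are the Rust standard-library integer methods that correspond to those operations. Whether each one compiles to a single instruction depends on the target and enabled CPU features. The helper names `add_u128_limbs` and `widening_mul` are made up for this sketch.

```rust
/// Add two 128-bit numbers held as (low, high) pairs of 64-bit limbs,
/// the way an add / add-with-carry instruction pair would.
fn add_u128_limbs(a: (u64, u64), b: (u64, u64)) -> (u64, u64) {
    let (lo, carry) = a.0.overflowing_add(b.0);
    let hi = a.1.wrapping_add(b.1).wrapping_add(carry as u64);
    (lo, hi)
}

/// Full 64x64 => 128-bit product, returned as (low, high) halves.
fn widening_mul(a: u64, b: u64) -> (u64, u64) {
    let p = (a as u128) * (b as u128);
    (p as u64, (p >> 64) as u64)
}

fn main() {
    let x: u32 = 0x0000_F0F0;
    println!("popcount  = {}", x.count_ones());
    println!("clz / ctz = {} / {}", x.leading_zeros(), x.trailing_zeros());
    println!("byte-rev  = {:#010x}", x.swap_bytes());
    println!("bit-rev   = {:#010x}", x.reverse_bits());
    println!("rotate    = {:#010x}", x.rotate_left(8));
    println!("128-bit add  = {:?}", add_u128_limbs((u64::MAX, 0), (1, 0)));
    println!("widening mul = {:?}", widening_mul(u64::MAX, 2));
}
```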

As transistor budgets have increased, having special-purpose execution units that are idle most of the time is fine; current transistor densities need a significant fraction of the die area to be "dark silicon" at any given moment to not melt.

Fancier combinations of shifting / rotating / masking are sometimes found in a single instruction, like bitfield insert / extract instructions in some ISAs. PowerPC is notably excellent here, with instructions like rlwinm that rotate a register left by an immediate, and mask it to clear bits except between two positions specified by 2 other immediates. Or another variant can insert that bitfield into another register at an arbitrary position instead of extracting to an arbitrary position in a zeroed register.
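Here is a small software model of what rlwinm computes, as I understand its semantics: rotate left by SH, then AND with a mask of ones from bit MB through bit ME, using IBM bit numbering where bit 0 is the most significant bit. The function names `ppc_mask` and `rlwinm` are just labels for this sketch.

```rust
/// Mask with ones from bit `mb` through bit `me`, in IBM numbering
/// (bit 0 = most significant). If mb > me the mask wraps around.
fn ppc_mask(mb: u32, me: u32) -> u32 {
    debug_assert!(mb < 32 && me < 32);
    let from_mb = u32::MAX >> mb; // ones from bit mb down to the LSB
    let through_me = u32::MAX << (31 - me); // ones from the MSB through bit me
    if mb <= me {
        from_mb & through_me
    } else {
        from_mb | through_me
    }
}

/// Software model of: rlwinm rA, rS, SH, MB, ME
fn rlwinm(rs: u32, sh: u32, mb: u32, me: u32) -> u32 {
    rs.rotate_left(sh) & ppc_mask(mb, me)
}

fn main() {
    let x = 0x1234_5678u32;
    // Extract the top byte: rotate it down to the bottom, keep the low 8 bits.
    println!("{:#x}", rlwinm(x, 8, 24, 31));
    // A left shift by 4 is the same operation with a mask that clears the low 4 bits.
    println!("{:#x}", rlwinm(x, 4, 0, 27));
}
```

One instruction covering rotate, shift, and bitfield-extract is why compilers for PowerPC can often use it where other ISAs need a shift plus an AND.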

Most of the things that Rust has builtins for with primitive integer types like i32 are supported fairly directly by at least some ISA, although AFAIK not integer pow. (Rust is nice like this, unlike some languages like C and C++ that refuse to portably expose modern CPU features.) Integer absolute value is rare, although it can be done branchlessly in only a couple instructions on most ISAs (which is presumably why they don't bother to provide an instruction for it). Support for saturating integer arithmetic is also rareish outside of DSPs. On x86 it's only available for some SIMD-integer sizes.
On most ISAs, add/sub/mul simply wrap by truncating the result to the fixed width of a register. (With carry-out (optionally) going into a flag bit if there is one). Most ISAs have ways to check for signed overflow after the fact.
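A short sketch of how Rust exposes these behaviours on i32, plus one common branchless absolute-value sequence (arithmetic shift, XOR, subtract). The name `branchless_abs` is made up here; in real code you'd just call `wrapping_abs` and let the compiler pick the instructions.

```rust
/// Absolute value without a branch: `sign` is 0 for non-negative x and
/// -1 (all ones) for negative x, so (x ^ sign) - sign negates only negatives.
/// Like the hardware sequence, it wraps for i32::MIN.
fn branchless_abs(x: i32) -> i32 {
    let sign = x >> 31; // arithmetic right shift on a signed type
    (x ^ sign).wrapping_sub(sign)
}

fn main() {
    let a = i32::MAX;
    println!("wrapping:    {}", a.wrapping_add(1)); // truncate to register width
    println!("overflowing: {:?}", a.overflowing_add(1)); // result plus overflow flag
    println!("checked:     {:?}", a.checked_add(1)); // None on signed overflow
    println!("saturating:  {}", a.saturating_add(1)); // clamp at the maximum
    println!("abs(-5):     {}", branchless_abs(-5));
    // For unsigned types, the bool from overflowing_add is the carry-out.
    println!("carry-out:   {:?}", u32::MAX.overflowing_add(1));
}
```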

x86 with BMI2 even has pext/pdep for bit pack/unpack according to a mask. https://www.felixcloutier.com/x86/PDEP.html, and see AVX2 what is the most efficient way to pack left based on a mask? for a use-case.
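For anyone unfamiliar with what pext / pdep compute, here is a portable bit-at-a-time software model. It is a sketch of the semantics only; the hardware instructions do this in one operation, and the function names simply mirror the mnemonics.

```rust
/// Model of pext: gather the bits of `src` selected by `mask` and pack
/// them contiguously into the low bits of the result.
fn pext(src: u64, mut mask: u64) -> u64 {
    let mut out = 0u64;
    let mut k = 0u32; // next output bit position
    while mask != 0 {
        let lowest = mask & mask.wrapping_neg(); // isolate lowest set mask bit
        if src & lowest != 0 {
            out |= 1u64 << k;
        }
        k += 1;
        mask &= mask - 1; // clear that mask bit
    }
    out
}

/// Model of pdep: scatter the low bits of `src` to the positions of the
/// set bits in `mask`.
fn pdep(src: u64, mut mask: u64) -> u64 {
    let mut out = 0u64;
    let mut k = 0u32; // next input bit position
    while mask != 0 {
        let lowest = mask & mask.wrapping_neg();
        if (src >> k) & 1 != 0 {
            out |= lowest;
        }
        k += 1;
        mask &= mask - 1;
    }
    out
}

fn main() {
    println!("{:#x}", pext(0x1234_5678, 0x00FF_00FF));
    println!("{:#x}", pdep(0x3478, 0x00FF_00FF));
}
```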

x86 also has a built-in true (not pseudo) RNG, via rdrand / rdseed.

x86 with AVX (for vmaskmov) and especially AVX512 (masking for anything) has support for masked loads and even masked stores, which conditionally don't actually store some elements, according to a mask in another register (a vector register for vmaskmov, a dedicated mask register for AVX512). 32-bit ARM can do something similar for scalar with predicated instructions that execute like a NOP if the predicate (flag condition) is false. Normally you need to branch if you need to maybe not store.
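A scalar model of what a masked store does, assuming 8 lanes of i32 and one mask bit per lane (roughly the AVX512 style). The function `masked_store` is a hypothetical helper for illustration; the loop with a branch per lane is exactly what the hardware feature lets you avoid.

```rust
/// Write src[lane] to dst[lane] only where the corresponding mask bit is set;
/// other lanes of dst are left untouched.
fn masked_store(dst: &mut [i32; 8], src: &[i32; 8], mask: u8) {
    for lane in 0..8 {
        if (mask >> lane) & 1 == 1 {
            dst[lane] = src[lane];
        }
    }
}

fn main() {
    let mut dst = [0i32; 8];
    let src = [1, 2, 3, 4, 5, 6, 7, 8];
    masked_store(&mut dst, &src, 0b1010_0101);
    println!("{dst:?}");
}
```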

ISAs with SIMD shuffles that take their control operand from another vector can use those to do for example 16x 4-bit LUT lookups in parallel from a vector of 16 bytes. This can be used to e.g. vectorize a popcount, or do other things like vectorize a Galois-field multiply. Or to translate a 0..15 integer to its appropriate hex ASCII digit: How to convert a binary integer number to a hex string? shows how you can use x86 SIMD to do that efficiently.
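To show the idea without intrinsics, here is a scalar model of a 16-lane byte shuffle used as a 4-bit lookup table, applied to the hex-digit case. `shuffle_bytes` imitates x86 pshufb semantics as I understand them (index byte with the high bit set gives 0, otherwise the low 4 bits select a table byte); both function names are made up for this sketch.

```rust
/// Scalar model of a pshufb-style shuffle: each index byte picks one of the
/// 16 table bytes, or produces 0 if its high bit is set.
fn shuffle_bytes(table: [u8; 16], idx: [u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for i in 0..16 {
        out[i] = if idx[i] & 0x80 != 0 {
            0
        } else {
            table[(idx[i] & 0x0F) as usize]
        };
    }
    out
}

/// Convert a u64 to 16 hex digits with one "shuffle": split into nibbles,
/// then use the nibbles as indices into a table of ASCII digits.
fn u64_to_hex(x: u64) -> String {
    const DIGITS: [u8; 16] = *b"0123456789abcdef";
    let mut nibbles = [0u8; 16];
    for (i, &b) in x.to_be_bytes().iter().enumerate() {
        nibbles[2 * i] = b >> 4; // high nibble first
        nibbles[2 * i + 1] = b & 0x0F;
    }
    let ascii = shuffle_bytes(DIGITS, nibbles);
    String::from_utf8(ascii.to_vec()).expect("hex digits are ASCII")
}

fn main() {
    println!("{}", u64_to_hex(0x0123_4567_89ab_cdef));
}
```

With a real SIMD shuffle, all 16 lookups in `shuffle_bytes` happen in one instruction. Swapping the table for per-nibble bit counts gives the vectorized popcount building block.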

Other SIMD operations include SAD (sum of absolute differences), heavily used by motion search in video codecs. It can also be used against a vector of all zeros to get a horizontal sum of unsigned bytes.
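A scalar sketch of one 8-byte SAD group (x86 psadbw works on groups of 8 unsigned bytes), including the all-zeros trick for a horizontal sum. `sad8` is a name chosen for this example.

```rust
/// Sum of absolute differences over 8 unsigned bytes.
/// The maximum possible result is 8 * 255 = 2040, so u16 is wide enough.
fn sad8(a: [u8; 8], b: [u8; 8]) -> u16 {
    a.iter()
        .zip(b.iter())
        .map(|(&x, &y)| x.abs_diff(y) as u16)
        .sum()
}

fn main() {
    let block = [10, 20, 30, 40, 50, 60, 70, 80];
    let candidate = [12, 18, 30, 45, 50, 0, 70, 81];
    // Lower SAD means the candidate block is a closer match.
    println!("SAD = {}", sad8(block, candidate));
    // |x - 0| == x for unsigned bytes, so SAD against zero is a horizontal sum.
    println!("sum = {}", sad8([1, 2, 3, 4, 5, 6, 7, 8], [0; 8]));
}
```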

Some ISAs like x86 have support for carryless-multiply. (Like a regular multiply but with XOR instead of + for "adding" the shifted partial products.)
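A minimal software version of a 64x64 => 128-bit carryless multiply, just to show the XOR-of-shifted-partial-products idea. The hardware instruction (pclmulqdq on x86) does this in one operation; `clmul` is a name picked for this sketch.

```rust
/// Carryless multiply: like schoolbook binary multiplication, but the
/// shifted partial products are combined with XOR, so no carries propagate.
fn clmul(a: u64, b: u64) -> u128 {
    let mut acc: u128 = 0;
    for i in 0..64 {
        if (b >> i) & 1 == 1 {
            acc ^= (a as u128) << i;
        }
    }
    acc
}

fn main() {
    // As polynomials over GF(2): (x + 1) * (x + 1) = x^2 + 1, not x^2 + 2x + 1.
    println!("clmul(3, 3) = {}, 3 * 3 = {}", clmul(3, 3), 3 * 3);
}
```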

Of course most modern ISAs have an FPU which can do all the IEEE basic operations with the required <= 0.5ulp of error: + - * / and sqrt. And often FMA.
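To see why FMA is more than a speed optimization, here is a small example where rounding once (fused) gives a different answer from rounding twice (separate multiply and subtract). The inputs are chosen so that the exact product 1 + 2^-29 + 2^-60 doesn't fit in an f64. `fma_demo` is a made-up helper; `f64::mul_add` is the standard-library fused multiply-add.

```rust
/// Returns (separate, fused) results for x * x - c, where x = 1 + 2^-30 and
/// c = 1 + 2^-29, so the exact answer is 2^-60.
fn fma_demo() -> (f64, f64) {
    let eps: f64 = 1.0 / (1u64 << 30) as f64; // exactly 2^-30
    let x: f64 = 1.0 + eps;
    let c: f64 = 1.0 + 2.0 * eps; // exactly 1 + 2^-29
    // x * x is rounded to an f64 first, which drops the 2^-60 term.
    let separate = x * x - c;
    // mul_add rounds only once, after the exact product and add.
    let fused = x.mul_add(x, -c);
    (separate, fused)
}

fn main() {
    let (separate, fused) = fma_demo();
    println!("separate = {separate:e}");
    println!("fused    = {fused:e}");
}
```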

I'm sure I've left some integer things out, and we'd be here all day if I tried to list every kind of arithmetic operation that x86 with AVX512 can do. There are seriously a lot of very specific instructions.

Upvotes: 0
