Reputation: 167
I am not so much interested in the "small print": the differences when developing code on each platform in terms of what a programmer is used to or finds easier to do, etc. Nor am I interested in the detailed physical differences in the core (I don't mind them being mentioned if it suits your narrative, I just don't want to focus on them).
I am simply asking why a CISC architecture such as x86 is superior to a RISC architecture, or is it not?
I mean, why be "Complex" (CISC) if you can do everything just as well while being Reduced in complexity (RISC)?
Is there something that x86 can do that ARM cannot? If there isn't anything, then why did we bother (historically) developing CISC instead of focusing on RISC?
Today ARM seems to do everything an Intel computer does; they even have server-oriented designs...
It bobs my uncle..
Upvotes: 4
Views: 8237
Reputation: 363852
This is part of an answer I wrote for Could a processor be made that supports multiple ISAs? (ex: ARM + x86), originally posted here when that was closed; I've now edited it down to keep just the parts that answer this question.
This is not an exhaustive list of differences, just some key differences that make building a bi-arch CPU not as easy as slapping a different front-end in front of a common back-end design. (I know that wasn't the aspect this question intended to focus on).
The more different the ISAs, the harder it would be, and the more overhead it would cost in the pipeline, especially in the back-end.
A CPU that could run both ARM and x86 code would be significantly worse at either one than a pure design that only handles one.
Efficiently running 32-bit ARM requires support for fully predicated execution, including fault suppression for loads / stores. (Unlike AArch64 or x86, which only have ALU-select type instructions like csinc vs. cmov / setcc that just have a normal data dependency on FLAGS as well as their other inputs.)
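Roughly, the difference looks like this (an illustrative sketch, not compiler output; register choices are arbitrary):

    @ 32-bit ARM: the load itself is predicated; if r2 == 0 it is
    @ architecturally a no-op, so a bad pointer in r1 cannot fault
    cmp     r2, #0
    ldrne   r0, [r1]

    ; x86: cmov only selects between values that are already available,
    ; so the load has to execute unconditionally and can still fault even
    ; if its result ends up being discarded
    test    edx, edx
    mov     eax, [rcx]          ; always runs; faults if rcx is bad
    cmovz   eax, ebx            ; discard the loaded value if edx == 0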
ARM and AArch64 (especially SIMD shuffles) have several instructions that produce 2 outputs, while almost all x86 instructions only write one output register. So x86 microarchitectures are built to track uops that read up to 3 inputs (2 before Haswell/Broadwell), and write only 1 output (or 1 reg + EFLAGS).
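For example (an illustrative sketch):

    @ 32-bit NEON: vuzp de-interleaves in place, rewriting BOTH d0 and d1,
    @ i.e. one instruction with two register outputs
    vuzp.8  d0, d1

    // AArch64: ldp writes two integer registers with one instruction
    ldp     x0, x1, [sp]

    ; x86: a shuffle like pshufd has exactly one destination register
    pshufd  xmm0, xmm1, 0x1b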
x86 requires tracking the separate components of a CISC instruction, e.g. the load and the ALU uops for a memory source operand, or the load, ALU, and store for a memory destination.
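For instance (a sketch; exact uop counts vary by microarchitecture):

    ; memory-source form: decodes into a load uop + an ALU uop
    add     eax, [rdi]

    ; memory-destination form: load + ALU + store, where the store is
    ; itself tracked as store-address + store-data uops on Intel CPUs
    add     [rdi], eax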
x86 requires coherent instruction caches, and snooping for stores that modify instructions already fetched and in flight in the pipeline, or some way to handle at least x86's strong self-modifying-code ISA guarantees (Observing stale instruction fetching on x86 with self-modifying code).
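To illustrate the contrast (a sketch, assuming x0 holds the address of the modified instruction): on x86 a plain store into code you're about to run is enough, while on AArch64 whoever modified the code has to do explicit cache maintenance plus an ISB, along the lines of:

    dc      cvau, x0        // clean the D-cache line to the point of unification
    dsb     ish             // wait for the clean to complete
    ic      ivau, x0        // invalidate the corresponding I-cache line
    dsb     ish             // wait for the invalidate to complete
    isb                     // discard already-fetched instructions on this core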
x86 requires a strongly-ordered memory model (program order + store buffer with store-forwarding). You have to bake this into your load and store buffers, so I expect that even when running ARM code, such a CPU would basically still use x86's far stronger memory model. (Modern Intel CPUs speculatively load early and do a memory-order machine clear on mis-speculation, so maybe you could let that happen and simply not do those pipeline nukes. Except in cases where it was due to mis-predicting whether a load was reloading a recent store by this thread or not; that of course still has to be handled correctly.)
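A classic message-passing example of what that strong model buys you (illustrative; labels, registers, and addresses are made up):

    ; x86 writer: TSO keeps these two stores visible in program order
    mov     dword [data], 1
    mov     dword [flag], 1

    ; x86 reader: observing flag == 1 guarantees it also sees data == 1
    mov     eax, [flag]
    mov     ebx, [data]

    // AArch64 (x1 = &data, x2 = &flag, w0 = 1): with plain str/ldr the
    // reader could see flag == 1 and still read data == 0, so you need
    // release/acquire (or dmb barriers) to get the same guarantee
    str     w0, [x1]        // writer: plain store of data
    stlr    w0, [x2]        // writer: release store of flag
    ldar    w3, [x2]        // reader: acquire load of flag
    ldr     w4, [x1]        // reader: plain load of data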
A pure ARM could have simpler load / store buffers that didn't interact with each other as much. (Except for the purpose of making stlr / ldar release / acquire cheaper, not just fully stalling.)
Different page-table formats. (You'd probably pick one or the other for the OS to use, and only support the other ISA for user-space under a native kernel.)
If you did try to fully handle privileged / kernel stuff from both ISAs, e.g. so you could have HW virtualization with VMs of either ISA, you also have stuff like control-register and debug facilities.
So does this mean that the x86 instructions get translated to some weird internal RISC ISA during execution?
Yes, but that "RISC ISA" is not similar to ARM. For example, it has all the quirks of x86, like shifts leaving FLAGS unmodified if the shift count is 0. (Modern Intel handles that by decoding shl eax, cl to 3 uops; Nehalem and earlier stalled the front-end if a later instruction wanted to read FLAGS from a shift.)
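A sketch of why that quirk is painful for the back-end (illustrative):

    cmp     eax, ebx
    shl     edx, cl         ; if cl == 0 this must leave FLAGS untouched
    jz      was_equal       ; so this jz might be reading flags from the cmp
                            ; OR from the shl, depending on cl at runtime
was_equal: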
Probably a better example of a back-end quirk that needs to be supported is x86 partial registers, like writing AL and AH, then reading EAX. The RAT (register allocation table) in the back-end has to track all that, and issue merging uops or however it handles it. (See Why doesn't GCC use partial registers?).
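For example (illustrative):

    mov     al, 1           ; writes only the low 8 bits of RAX
    mov     ah, 2           ; writes bits 8-15, renamed separately from AL
                            ; on some Intel microarchitectures
    add     ecx, eax        ; reading EAX here forces AH and AL to be merged
                            ; back into the full register (a merging uop, or
                            ; a partial-register stall on older CPUs)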
See also Why does Intel hide internal RISC core in their processors? - that RISC-like ISA is specialized for executing x86, not a generic neutral RISC pipeline like you'd build as a back-end for an AArch64 or RISC-V.
Upvotes: 5
Reputation: 179749
You're trying to restart a debate that ended 20 years ago. ARM is not RISC anymore, and x86 is not CISC anymore.
That said, the reason for CISC was simple: if you could execute 100,000 instructions per second, the CPU that needed the fewest instructions for a given task would win. One complex instruction would be better than two simple instructions.
RISC is based on the observation that as CPUs became faster, the time needed would vary a lot between instructions. Two simple instructions might in fact be faster than one complex one, especially when you optimized the CPU for simple instructions.
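As a sketch of that trade-off (illustrative; register choices are arbitrary):

    ; x86 (CISC): one instruction reads memory, adds, and writes it back
    add     dword [counter], 1

    @ classic RISC / ARM style: the same update takes three instructions
    ldr     r0, [r1]        @ r1 = &counter
    add     r0, r0, #1
    str     r0, [r1]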
Upvotes: 8