Reputation: 3031
I'm very curious how assembly languages work. I'm keeping this general because I'm not talking only about Intel x86 assembly (although it's the only one I'm remotely familiar with). To be a bit more clear...
mov %eax,%ebx
How does the computer know what an instruction like "mov" does? How does it know that eax and ebx are registers? Do people write grammars for assembly languages? How do they write this? I imagine nothing is stopping someone from writing an assembly language that substitutes the mov instruction with something like dog or horse, etc. (obviously this isn't semantic at all).
Sorry if this isn't too clear, but it's something I find a bit puzzling. I know it can't be magic, but I can't see how it works. I've looked up some stuff on Wikipedia, but all it seems to say is that the assembly gets translated down to machine code. What I'm asking, I suppose, is how that translation occurs.
Thoughts?
EDIT: I realize that this stuff is defined in reference manuals and the like. I guess what I wish to know is how you tell your processor "Okay, when you see mov you're gonna do this". I also know that it's probably a sequence of a ton of logic gates, but there has to be some way for the processor to recognize that mov is the symbol that means "use these logic gates".
Upvotes: 14
Views: 22036
Reputation: 31
I looked everywhere for the answer and finally found it:
http://1.bucarotechelp.com/computers/architecture/86011102.asp
The decoder in the example takes three input bits (A, B and C), so it can select one of 8 different instructions depending on their binary value.
Only one output path takes the ON value at a time, and all the others stay at 0. That ON signal goes to the ALU and switches on, for example, the path that adds two values.
So in general the decoder will not accept any code other than the codes it was built to accept. It may even be manufactured to decode 32 functions while the ALU supports only some of them, so the two have to be designed to work together.
So basically MOV is a binary pattern. The decoder is a circuit that offers many different paths depending on the opcode bits given to it. Its logic gates combine those bits until they produce the control signal for the actual mov operation (4 electrical signals in the example, though that depends on the ALU architecture, which in turn depends on the manufacturer). That signal enters the ALU from the side and tells it what to do with the 8 data bits, 4 on the left and 4 on the right, that enter it from the top. The result comes out after the ALU.
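As a rough illustration of the decoder idea, here is a small Python sketch of a 3-to-8 decoder. The operation names attached to the eight output lines are made up for this example and do not come from any real chip.

```python
def decode_3_to_8(a, b, c):
    """Return 8 output lines; exactly one is 1, chosen by the bits A, B, C."""
    index = (a << 2) | (b << 1) | c  # treat A as the most significant bit
    return [1 if line == index else 0 for line in range(8)]

# Hypothetical operations wired to each output line (an assumption for illustration).
OPERATIONS = ["nop", "add", "sub", "and", "or", "xor", "mov", "halt"]

def selected_operation(a, b, c):
    """Name the single operation whose line is ON for this opcode."""
    lines = decode_3_to_8(a, b, c)
    return OPERATIONS[lines.index(1)]

print(decode_3_to_8(1, 1, 0))       # only line 6 is ON
print(selected_operation(1, 1, 0))  # mov
```

In hardware there is no list lookup, of course. Each output line is just a small AND of the input bits (or their inverses), and the one line that goes ON enables the matching part of the ALU.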
Upvotes: 0
Reputation: 81
First, every instruction such as mov or add has its own binary encoding, for example 10101010, 00110000 or 10100, which the CPU always understands.
But humans can't remember all of those, so for programming purposes English-like names are used instead, and these are ultimately translated back into binary.
Second, the conversion from those names (mov, add, etc.) to binary happens when the code is assembled or compiled. After that, the binary instructions are stored or loaded into RAM, ready for execution.
That may not fully answer your question, I know.
If you want to picture exactly how the CPU executes instructions and works on them, you can learn it with graphics here. See this video on YouTube: (link given here)
Watch it once and I promise it will be clearer to you. Just have a look.
Upvotes: 3
Reputation: 26171
What you see there are mnemonics, which make it easy for a programmer to write assembly; the code is not executable in mnemonic form, however. When you pass these assembly instructions through an assembler, they are translated into the machine code they represent, which is what the CPU and its various co-processors interpret and execute (the CPU generally breaks it down into smaller units called micro-ops).
If you're curious as to how exactly it does that, well that's a long process, but this has all that information.
All the semantics, etc. are handled by the assembler, which checks for validity and integrity where possible (one can still assemble invalid code however!). This basically makes assembly a low-level language, even though it has a 1 to 1 correlation to the outputted machine code (except when using macro based assemblers, but then the macros still expand to 1 to 1).
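To show how little the mnemonic itself matters, here is a minimal sketch of an assembler for a made-up instruction set. The opcode values, register names and two-byte format are all assumptions invented for this example, not any real architecture.

```python
# Hypothetical toy instruction set: 1 opcode byte + 1 byte holding two register numbers.
OPCODES = {"mov": 0x01, "add": 0x02}
REGISTERS = {"r0": 0, "r1": 1, "r2": 2, "r3": 3}

def assemble(line, opcodes=OPCODES):
    """Translate one line such as 'mov r1, r2' into two bytes of toy machine code."""
    mnemonic, operands = line.split(maxsplit=1)
    dst, src = (name.strip() for name in operands.split(","))
    if mnemonic not in opcodes:
        raise ValueError(f"unknown mnemonic: {mnemonic}")
    return bytes([opcodes[mnemonic], (REGISTERS[dst] << 4) | REGISTERS[src]])

# Renaming a mnemonic changes only the assembler's table, not the machine code.
SILLY_OPCODES = {"dog": 0x01, "horse": 0x02}

print(assemble("mov r1, r2").hex())                 # 0112
print(assemble("dog r1, r2", SILLY_OPCODES).hex())  # 0112
```

Both lines produce the same bytes. The CPU only ever sees the bytes, so the choice of the word "mov" is purely a convention of the assembler.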
Upvotes: 11
Reputation: 177520
Your CPU doesn’t execute assembly. The assembler converts it into machine code. This process depends on both the particular assembly language and the target computer architecture. Generally those go hand in hand, but you might find different flavors of assembly language for the same architecture (the Intel-style syntax used by NASM vs. AT&T syntax, for example), which all translate into similar machine code.
A typical (MIPS) assembly instruction such as “And immediate”
andi $t, $s, imm
would become the 32-bit machine code word
0011 00ss ssst tttt iiii iiii iiii iiii
where s and t are numbers from 0–31 which name registers, and i is a 16-bit value. It’s this bit pattern that the CPU actually executes. The 001100 in the beginning is the opcode corresponding to the andi instruction, and the bit pattern that follows (5-bit source register, 5-bit target register, 16-bit literal) varies depending on the instruction. When this instruction is placed into the CPU, it responds appropriately by decoding the opcode, selecting the registers to be read and written, and configuring the ALU to perform the necessary arithmetic.
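The packing described above can be sketched in a few lines of Python. The function names here are my own, and the register numbers 8 and 9 in the usage line are picked just for illustration.

```python
ANDI_OPCODE = 0b001100  # opcode field from the bit pattern above

def encode_andi(t, s, imm):
    """Pack 'andi $t, $s, imm' into a 32-bit word: opcode | s | t | imm."""
    if not (0 <= t < 32 and 0 <= s < 32):
        raise ValueError("register numbers must be 0-31")
    if not 0 <= imm < (1 << 16):
        raise ValueError("immediate must fit in 16 bits")
    return (ANDI_OPCODE << 26) | (s << 21) | (t << 16) | imm

def decode_fields(word):
    """Split a word back into (opcode, s, t, imm), roughly what the decoder does."""
    return (word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F, word & 0xFFFF

word = encode_andi(t=8, s=9, imm=0x00FF)
print(f"{word:032b}")       # the 32 bits, opcode first
print(hex(word))            # 0x312800ff
print(decode_fields(word))  # (12, 9, 8, 255)
```

The decode step is the interesting half: the CPU does the same field extraction, except with wires that route fixed bit positions to the register file and the ALU rather than with shifts and masks.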
Upvotes: 10
Reputation: 9326
Computers are basically built out of logic gates. Though this is an abstract idealization of the real physical machinery, it is close enough to the truth that we can believe it for now. At a very basic level, these things work just like true/false predicates. Or if you've ever played Minecraft, it works a lot like redstone. The field which studies how to put together logic gates to make interesting complex circuits, like computers, is called computer architecture. It is traditionally viewed as a mixture of computer science and electrical engineering.
The most basic logic gates are things like AND and OR, which just take bits together and smash out some boolean operation between them. By creating feedback loops in logic gates you can store memory. One type of standard memory circuit is called a flip-flop, and it is basically a little loop of wire together with some AND gates and power to keep it stable. Putting together multiple flip-flops lets you create bit vectors, and these things are called registers (which are what things like eax and ebx represent). There are also many other types of parts, like adders, multiplexers and so on, which implement various pieces of boolean logic. Here is a directory of some circuits:
http://www.labri.fr/perso/strandh/Teaching/AMP/Common/Strandh-Tutorial/Dir.html
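If you want to play with the feedback idea without any hardware, here is a small Python sketch of one common textbook memory cell, an SR latch. Note that I build it from two cross-coupled NOR gates rather than AND gates, since that is the variant I am sure of; the class and method names are just for this example.

```python
def nor(a, b):
    """NOR gate: output is 1 only when both inputs are 0."""
    return 0 if (a or b) else 1

class SRLatch:
    """One bit of memory from two cross-coupled NOR gates."""

    def __init__(self):
        self.q, self.q_bar = 0, 1

    def update(self, s, r):
        # Re-evaluate both gates until the feedback loop settles.
        for _ in range(4):
            q = nor(r, self.q_bar)
            q_bar = nor(s, q)
            if (q, q_bar) == (self.q, self.q_bar):
                break
            self.q, self.q_bar = q, q_bar
        return self.q

latch = SRLatch()
latch.update(s=1, r=0)         # set: store a 1
print(latch.update(s=0, r=0))  # inputs released, the loop still holds 1
latch.update(s=0, r=1)         # reset: store a 0
print(latch.update(s=0, r=0))  # the loop holds 0
```

A 32-bit register like eax is, very roughly, 32 such cells side by side plus some logic that decides when they are allowed to change.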
Your CPU is basically a bunch of these things stuck together, all built out of the same basic logic gates. The way that your computer knows how to keep on executing instructions is that there is a special piece of machinery called a clock which emits pulses at regular intervals. When your CPU's clock emits a pulse, it sets off a sequence of reactions in these logic gates that causes the CPU to execute an instruction. For example, when it reads an instruction that says "mov eax, ebx", what ends up happening is that the state of one of these registers (ebx) gets copied over to the state of another (eax) just in time before the next pulse comes out of the clock.
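Here is a very rough Python sketch of that fetch, decode, execute loop for a made-up two-instruction machine. The opcode numbers and the register numbering are invented for this example; only the register names are borrowed from x86. Each pass through the loop stands in for one clock pulse.

```python
# Hypothetical two-instruction machine; the opcode numbers are made up for illustration.
MOV, HALT = 0x01, 0xFF
REG_NAMES = ["eax", "ebx", "ecx", "edx"]  # names borrowed from x86, numbering is arbitrary

def run(program, registers):
    """Execute one instruction per simulated clock pulse until HALT."""
    pc = 0  # program counter: index of the next instruction
    while True:
        opcode, dst, src = program[pc]  # fetch
        pc += 1
        if opcode == MOV:               # decode + execute
            registers[REG_NAMES[dst]] = registers[REG_NAMES[src]]
        elif opcode == HALT:
            return registers
        else:
            raise ValueError(f"unknown opcode {opcode:#x}")

regs = {"eax": 0, "ebx": 42, "ecx": 0, "edx": 0}
program = [(MOV, 0, 1), (HALT, 0, 0)]  # "mov eax, ebx" then stop
print(run(program, regs)["eax"])       # 42
```

The if/elif chain plays the role of the instruction decoder. In silicon that choice is made by gates looking at the opcode bits, not by software.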
Of course this is a gross oversimplification, but as a high level picture it is essentially correct. The rest of the details take a while to explain, and there are a few things here that I neglected due to unnecessary subtlety (for example, in a real CPU sometimes multiple instructions get executed in a single clock; due to register renaming, eax isn't always the same physical register; and due to reordering, the order in which instructions actually get executed can change, and so on). However, it is definitely worth learning the whole story since it is actually quite amazing (or at least I like to think so!). You would be doing yourself a great favor to go out and read up on this stuff, and maybe try building a few circuits of your own (either using real hardware, a simulator, or even Minecraft!).
Anyway, hope that answers a bit of your question about what mov eax, ebx does.
Upvotes: 21
Reputation: 206669
The instructions in assembly code map to the actual instruction set and register names for the CPU architecture you're targeting. mov is an x86 instruction, and eax and others are the names of (in this case general purpose) registers defined in the Intel x86 reference manual.
Same thing for other architectures - the assembly code maps quite directly to the actual names of the operations as defined in the chip's specifications/documentation.
That mapping is way simpler than, for instance, compiling C code.
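To give a feel for how direct that mapping is, here is a minimal Python sketch that encodes the register-to-register mov from the question. It assumes 32-bit operand size and handles only the "mov %src,%dst" form in AT&T syntax, using the 0x89 opcode followed by a ModRM byte. The function name is my own, and a real assembler may well choose a different, equally valid encoding for the same instruction.

```python
# Register numbers used in x86 ModRM bytes (32-bit general purpose registers).
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3, "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

def encode_mov_att(line):
    """Encode AT&T-syntax 'mov %src,%dst' (register to register only) as 0x89 + ModRM."""
    mnemonic, operands = line.split(maxsplit=1)
    if mnemonic != "mov":
        raise ValueError("this sketch only knows mov")
    src, dst = (op.strip().lstrip("%") for op in operands.split(","))
    # mod=11 means both operands are registers; reg field = source, r/m field = destination.
    modrm = 0b11000000 | (REG32[src] << 3) | REG32[dst]
    return bytes([0x89, modrm])

print(encode_mov_att("mov %eax,%ebx").hex())  # 89c3
```

So "mov" becomes the byte 0x89, and "eax" and "ebx" become two 3-bit numbers packed into the next byte. That lookup is essentially all the "knowing" the assembler has to do for this instruction.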
Upvotes: 4