Reputation: 164
I took some courses where MIPS and x86 assembly were taught.
For MIPS we used the MARS simulator.
For x86 we wrote code into templated .S files whose details they basically ignored, so we just wrote the bodies of subroutines...
Both shared some keywords:
Others we did not use in MARS:
Now I need to learn ARM by myself and I don't really know how to start coding.
What I understand is that in MARS we wrote code that was compiled to ML and sent directly to the simulated CPU's memory as if it were a microcontroller (which, by the way, I did with a Microchip PIC). With x86, however, we ran through a host OS, Linux, and needed the compiler to add a "header" and assign VM to the program, so it was not trivial.
Probably the fact is that they taught us ASM more like a macro for ML and not as a language with low-level directives and features that needs to be compiled.
Googling, I found GAS, which might be what we did with x86...
So the questions are:
Upvotes: 1
Views: 105
Reputation: 71566
You don't need to make it that complicated.
int main ( void )
{
return(555);
}
gcc -O2 -c so.c -o so.o
objdump -D so.o
0000000000000000 <main>:
0: b8 2b 02 00 00 mov $0x22b,%eax
5: c3 retq
Or you could look at the assembly the compiler generates, but I prefer to disassemble. So now I can write this by hand:
.globl main
main:
mov $0x22b,%eax
retq
as so.s -o so.o
gcc so.o -o so
./so
And of course nothing comes out but
so.s
.globl fun
fun:
mov $0x22b,%eax
retq
so.c
#include <stdio.h>
int fun ( void );
int main ( void )
{
printf("%d\n",fun());
return(0);
}
as so.s -o fun.o
gcc so.c fun.o -o so
./so
555
And of course you can then complicate it as much as you like beyond that.
gcc outputs gnu assembler so
int fun ( void )
{
return(333);
}
gcc -O2 -save-temps -c so.c -o so.o
cat so.s
.file "so.c"
.section .text.unlikely,"ax",@progbits
.LCOLDB0:
.text
.LHOTB0:
.p2align 4,,15
.globl fun
.type fun, @function
fun:
.LFB0:
.cfi_startproc
movl $333, %eax
ret
.cfi_endproc
.LFE0:
.size fun, .-fun
.section .text.unlikely
.LCOLDE0:
.text
.LHOTE0:
.ident "GCC: (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609"
.section .note.GNU-stack,"",@progbits
Although the compiler output often has an excess of directives (useful for debuggers and other things, but not always used or required), you can use it to learn, to some extent, GNU assembler for this target (x86-64). You of course still need the documentation from the processor vendor (Intel in this case). Understand that the syntax in that document is not necessarily the syntax used by any particular toolchain that you have or will use. You have to be multilingual there, but you do see what the instructions are, what they do, their limits, etc.
MARS and other similar environments are quite useful for teaching and are often designed for that purpose, leaving out a lot of the traps that you can fall into. The goal is to learn the instruction set by playing with a simulator and get your feet wet in assembly language. I am not a fan of an assembly interface; for educational purposes I think the student should generate and see the machine code. Perhaps within that sim you can; I have only used it for SO questions. I use real or simulated MIPS processors if I want to play with MIPS.
Assembly language is specific to the tool, not the target. Assume that each assembler for any target has its own assembly language, and if there happens to be overlap then so be it.
global fun
fun:
mov eax, 333
ret
nasm so.s -felf64 -o so.o
gcc so.c so.o -o so
./so
333
There is the well-known Intel vs AT&T thing, but those are not complete syntaxes; that is the source/destination order being swapped relative to the Intel standard. nasm doesn't like .globl (try it); it likes global without the dot.
.globl fun
fun:
movl %eax, $333
ret
so.s:1: error: attempt to define a local label before any non-local labels
so.s:1: error: parser: instruction expected
so.s:3: error: parser: instruction expected
globl fun
fun:
movl %eax, $333
ret
nasm so.s -felf64 -o so.o
so.s:1: error: parser: instruction expected
so.s:3: error: parser: instruction expected
globl fun <-- note this is line 1
fun:
mov %eax, $333 <--- this is line 3
ret
nasm so.s -felf64 -o so.o
so.s:1: error: parser: instruction expected
so.s:3: error: expression syntax error
globl fun
fun:
mov eax, 333
ret
nasm so.s -felf64 -o so.o
so.s:1: error: parser: instruction expected
global fun
fun:
mov eax, 333
ret
And nasm is happy
as so.s -o so.o
so.s: Assembler messages:
so.s:1: Error: no such instruction: `global fun'
so.s:3: Error: too many memory references for `mov'
.global fun
fun:
mov 333, eax
ret
so.s: Assembler messages:
so.s:3: Error: too many memory references for `mov'
.global fun
fun:
mov $333, eax
ret
so.s: Assembler messages:
so.s:3: Error: no instruction mnemonic suffix given and no register operands; can't size instruction
.global fun
fun:
movl $333, eax
ret
And as is happy, BUT this is broken: it thinks eax is a label to be filled in later.
0000000000000000 <fun>:
0: c7 04 25 00 00 00 00 movl $0x14d,0x0
7: 4d 01 00 00
b: c3 retq
.global fun
fun:
movl $333, %eax
ret
0000000000000000 <fun>:
0: b8 4d 01 00 00 mov $0x14d,%eax
5: c3 retq
.global fun
fun:
movl $333, %eax
retq
0000000000000000 <fun>:
0: b8 4d 01 00 00 mov $0x14d,%eax
5: c3 retq
.global fun
fun:
mov $333, %eax
retq
0000000000000000 <fun>:
0: b8 4d 01 00 00 mov $0x14d,%eax
5: c3 retq
nasm:
global fun
fun:
mov eax, 333
ret
0000000000000000 <fun>:
0: b8 4d 01 00 00 mov $0x14d,%eax
5: c3 retq
Same machine code, different assembly language, in more ways than just reversing the source and destination (I used objdump to disassemble, which is why you see that syntax).
gas takes .globl or .global. Since the size of the mov is obvious from the eax register, which is 32 bits, the suffix isn't needed; movl and mov both apparently work with the binutils I have. Likewise ret vs retq produced the same instruction.
The joys of assembly language, especially with a painful target like x86 (the last instruction set you want to learn; there is a list of more useful/better ones).
But you can see that assembly language can and does differ for the same target and the same instructions, based on the tool used. And something like MARS starts to make even more sense for that use case.
You won't go wrong learning the gcc/binutils (GNU) tools, as you can use them on Windows, Mac, Linux, BSD, etc., and all but the system calls and possibly the binary file formats are going to be the same experience (okay, linker scripts and OS-specific stuff will differ).
Depending on the target there may be other good choices too. nasm is popular with folks who learned Intel syntax in the old days, and I suppose with others. Also, with code that has been lying about for a while and that gas pukes on, you might have half a chance with nasm.
And one or the other or both have command-line options for the Intel vs AT&T source/destination swapping.
Upvotes: 2