Reputation: 3583
Given the following code:
L1 db "word", 0
mov al, [L1]
mov eax, L1
What do the brackets in [L1] represent?
This question is specifically about NASM. The other major flavour of Intel-syntax assembly is MASM style, where brackets work differently when there's no register involved:
See Confusing brackets in MASM32
Upvotes: 75
Views: 67141
Reputation: 11
In MASM, brackets work like NASM when used with registers, and in that case are not optional. (Things are different for addressing modes that don't involve a register, see Confusing brackets in MASM32)
The brackets indicate that the register contains a pointer, and that the instruction works on the value stored at that address rather than on the pointer itself; when the bracketed operand is the source part of the instruction, that value is loaded. (Pointers use byte addressing: a pointer with value x refers to the xth byte of memory. A byte is 8 binary digits and one hexadecimal digit is 4 binary digits, so a byte is 2 hexadecimal digits.)
In binary machine code, whether an operand is a register or the memory that a register points to is defined in the ModR/M byte of the instruction. (You could write those bytes yourself by typing hexadecimal digits in notepad.exe and converting them into \x-escaped bytes, Python style.) (I'm finishing my MASM experience first, then I'm going to move on to scavenge information about what to type into notepad.exe through readings of Windows kernel/malware analysis; I'll come back to this post and write up an example.)
.686
.model flat, c
option casemap :none

include C:\masm32\include\kernel32.inc
includelib C:\masm32\lib\kernel32.lib

.data
message db "Hello world!", 0
.code

main proc
    call testfunc
    COMMENT @
    push 0FFFFh
    push testfunc
    pop ax
    @
    invoke ExitProcess, 404
main ENDP

testfunc proc
    sub esp, 1
    mov al, 0FFh
    mov [esp], al
    COMMENT @
    push 0FFFFh
    push 05EFFB880h
    push 0773BFF5Ch
    push 0FB038Fh
    mov al, [esp+8]
    @
    invoke ExitProcess, [esp]
testfunc ENDP

END main
On Windows, assemble, link, and run it, then print the exit status so you can compare:
C:\masm32\bin\ml /c /Zd /coff script_name.asm
C:\masm32\bin\Link /SUBSYSTEM:CONSOLE script_name.obj
script_name.exe
echo %ERRORLEVEL%
The program's exit status (printed with echo) would be the number stored to stack memory with mov [esp], al and passed as the arg to ExitProcess, ending in hex FF. (%ERRORLEVEL% converts the number to a string of decimal digits, not hex, but it's the same number.)
However, without the [] around esp, we also have to change AL to EAX, because mov needs both operands to be the same size and ESP is a 32-bit register. With the brackets removed, the operand is no longer the memory that esp points to; it is the pointer to the stack region held in esp.
testfunc proc
    mov eax, esp
    mov bl, 0FFh
    mov [eax], bl
    COMMENT @
    push 0FFFFh
    push 05EFFB880h
    push 0773BFF5Ch
    push 0FB038Fh
    mov al, [esp+8]
    @
    invoke ExitProcess, [esp]
testfunc ENDP
Tag: optional brackets
The above code shows that the brackets ALWAYS WORK the same way in a mov like these: whatever is inside them is used as a pointer, and the operand is the value at that pointer. Assembly language is just a readable way of writing machine code instead of raw bytes. (If you knew how the Windows kernel executes an exe file, for example by reverse engineering it, you could make your own exe files from scratch inside notepad. There isn't much support for that; malware analysis, however, does have enough support.)
(If you want to test the code, replace the testfunc lines in the first listing with this version, then build and run it the same way.) In this case, eax holds the same pointer as esp, an address in the stack segment. (The stack segment is important because it has its own instructions: PUSH takes a 32-bit immediate, register, or memory operand, and POP takes a register or memory operand.) So when you execute it, the bare esp operand is the value of the ESP register, a pointer value, not memory contents on the stack.
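For readers who think in C, here is a rough sketch of what mov eax, esp followed by mov [eax], bl does. It is only an analogy: the local variable slot stands in for the byte at the top of the stack, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring the second testfunc listing.
   `slot` stands in for the byte that ESP points at. */
uint8_t store_through_pointer(void)
{
    uint8_t slot = 0;        /* the byte at the top of the stack */
    uint8_t *eax = &slot;    /* mov eax, esp  : copy the pointer, no memory access */
    uint8_t bl = 0xFF;       /* mov bl, 0FFh */
    *eax = bl;               /* mov [eax], bl : store to the memory EAX points at */
    assert(eax == &slot);    /* the store did not change the pointer itself */
    return slot;             /* what a later load from [esp] would read */
}

/* Small self-check showing how the helper is meant to be used. */
void check_store_through_pointer(void)
{
    assert(store_through_pointer() == 0xFF);
}
```

Without the * (the brackets, in assembly terms), the assignment would replace the pointer instead of writing to the byte it points at.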
I'll come back and edit this post once in a while (if I actually get really good at assembly), so this can become an ultimate guide to assembly. I just got started in assembly and am making a quick script that finds the most significant bit within a specific range.
Resources that have helped me get this far with the script:
5 hour tutorial of the entirety of C++:
Helps me find out what stuff like DWORDs are (unsigned long).
https://www.bing.com
I read till half of Volume 3 and then skimmed the rest
https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html
Davy Wybiral's assembly language tutorial, to watch after all of the above:
https://www.youtube.com/watch?v=wLXIWKUWpSs&ab_channel=DavyWybiral
The Intel Software Developer Manual's section called 'Operation Section':
How to Start Coding Assembly on Windows (MASM)
https://www.youtube.com/watch?v=lCjbwLeLNfs&ab_channel=CharlesClayton
Again, I'll come back to here (this post, and as well as my future posts) and try to educate everyone, so my knowledge is equal with everyone reading.
Upvotes: 1
Reputation: 33014
Operands of this type, such as [ebp], are called memory operands.
All the answers here are good, but none of them mentions the caveat in treating "if brackets, then dereference" as a rigid rule: the lea instruction.
lea is an exception to the above rule. Say we have
mov eax, [ebp - 4]
4 is subtracted from the value of ebp, and the brackets indicate that the result is taken as an address; the value residing at that address is stored in eax. However, in lea's case, the brackets wouldn't mean that:
lea eax, [ebp - 4]
4 is subtracted from the value of ebp and the resulting value is stored in eax. This instruction just calculates the address and stores the calculated value in the destination register, without reading memory. See What is the difference between MOV and LEA? for further details.
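In C terms, the difference between the two instructions is roughly the difference between dereferencing a computed pointer and just computing it. The sketch below is an analogy only: the frame array and the function names are hypothetical, and it loads a single byte (like mov al, [ebp - 4]) rather than a full 32-bit value, to keep the example free of alignment concerns.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a stack frame; frame + 4 plays the role of EBP. */
static uint8_t frame[8] = {0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88};

/* Like `mov al, [ebp - 4]`: compute the address, then read the memory there. */
uint8_t mov_like(const uint8_t *ebp)
{
    return *(ebp - 4);
}

/* Like `lea eax, [ebp - 4]`: compute the address and stop; nothing is read. */
const uint8_t *lea_like(const uint8_t *ebp)
{
    return ebp - 4;
}

/* Small self-check showing both results for the same "EBP". */
void check_mov_vs_lea(void)
{
    const uint8_t *ebp = frame + 4;
    assert(mov_like(ebp) == 0x11);   /* the byte stored 4 below ebp */
    assert(lea_like(ebp) == frame);  /* only the address 4 below ebp */
}
```

Both functions do the same address arithmetic; only mov_like touches memory.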
Upvotes: 57
Reputation: 29186
The brackets mean to de-reference an address. For example
mov eax, [1234]
means, mov the contents of address 1234 to EAX. So:
1234 00001
EAX will contain 00001.
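One way to picture this is to treat memory as one big array of bytes, with the address as the index. The sketch below is a toy model under that assumption; the memory array, load32 and store32 are invented names for illustration, not part of any real API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy memory: an index into this array plays the role of an address. */
static uint8_t memory[4096];

/* Like `mov eax, [addr]`: read the 4 bytes that start at addr. */
uint32_t load32(uint32_t addr)
{
    uint32_t value;
    memcpy(&value, &memory[addr], sizeof value);
    return value;
}

/* Like `mov dword [addr], value` in NASM: write 4 bytes starting at addr. */
void store32(uint32_t addr, uint32_t value)
{
    memcpy(&memory[addr], &value, sizeof value);
}

/* Small self-check contrasting the contents at 1234 with the number 1234. */
void check_load(void)
{
    store32(1234, 1);
    assert(load32(1234) == 1);      /* mov eax, [1234] : contents at address 1234 */
    uint32_t eax = 1234;            /* mov eax, 1234   : the number 1234 itself */
    assert(eax != load32(1234));
}
```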
Upvotes: 12
Reputation: 101506
Direct memory addressing - al will be loaded with the value located at memory address L1.
Upvotes: 2
Reputation: 63935
Simply means to get the memory at the address marked by the label L1.
If you like C, then think of it like this: [L1] is the same as *L1
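Spelled out for the code in the question, the analogy might look like the sketch below. The function names are made up for illustration, and a C array is only an approximation of a NASM label.

```c
#include <assert.h>

static const char L1[] = "word";   /* L1 db "word", 0 */

/* mov al, [L1] : the byte stored at L1 */
char load_al(void)
{
    return *L1;
}

/* mov eax, L1 : the address L1 itself */
const char *load_eax(void)
{
    return L1;
}

/* Small self-check: the first load yields 'w', the second yields a pointer to it. */
void check_question_code(void)
{
    assert(load_al() == 'w');
    assert(load_eax() == L1);
    assert(*load_eax() == 'w');
}
```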
Upvotes: 30
Reputation: 799560
It indicates that the register should be used as a pointer for the actual location, instead of acting upon the register itself.
Upvotes: 0
Reputation: 42942
They mean that instead of moving the value of the register or numeric value L1 into the register al, treat the register value or numeric value L1 as a pointer into memory, fetch the contents of that memory address, and move that contents into al.
In this instance, L1 is a memory location, but the same logic would apply if a register name was in the brackets:
mov al, [ebx]
Also known as a load.
Upvotes: 1
Reputation: 882766
As with many assembler languages, this means indirection. In other words, the first mov loads al with the contents of L1 (the byte 'w' in other words), not the address.
Your second mov actually loads eax with the address L1 and you can later dereference that to get or set its content.
In both those cases, L1 is conceptually considered to be the address.
Upvotes: 1
Reputation: 110203
[L1] means the memory contents at address L1. After running mov al, [L1] here, the al register will receive the byte at address L1 (the letter 'w').
Upvotes: 63