Reputation: 3583

What do the brackets mean in NASM syntax for x86 asm?

Given the following code:

L1     db    "word", 0

       mov   al, [L1]
       mov   eax, L1

What do the brackets in [L1] represent?

This question is specifically about NASM. The other major flavour of Intel-syntax assembly is MASM style, where brackets work differently when there's no register involved:
See Confusing brackets in MASM32

Upvotes: 75

Answers (9)

kleirceval

Reputation: 11

In MASM, brackets work like NASM when used with registers, and in that case are not optional. (Things are different for addressing modes that don't involve a register, see Confusing brackets in MASM32)

The brackets indicate that the register contains a pointer, and that the instruction works on the memory at that pointer rather than on the register itself. (Pointers use byte addressing: a pointer is the number of a byte in memory; a byte is 8 binary digits, which is 2 hexadecimal digits, since one hexadecimal digit is 4 binary digits.) If the brackets are on the src part of the instruction, the value is read from memory at that address.

However, if dst has the brackets, the memory at that address is the operand the instruction works on; for mov, that is the location that gets written. (Memory here means the same byte-addressed memory described above.)

In binary machine code (typing hexadecimal digits in notepad.exe and then converting them into Python-style \x escapes), whether an operand is a register or the memory a register points to is selected in the ModR/M byte of the instruction, which is a single byte (2 hexadecimal digits). (I'm finishing my MASM experience first, then I'm going to move on to scavenge information about what to type into notepad.exe by reading about the Windows kernel and malware analysis; I'll come back to this post and write up an example.)

.686
.model flat, c
option casemap :none

include C:\masm32\include\kernel32.inc
includelib C:\masm32\lib\kernel32.lib

.data
    message db "Hello world!", 0
.code

main proc
    call testfunc
    COMMENT @
    push 0FFFFh
    push testfunc
    pop ax
    @
    invoke ExitProcess, 404
main ENDP

testfunc proc
    sub esp, 1
    mov al, 0FFh
    mov [esp], al
    COMMENT @
    push 0FFFFh
    push 05EFFB880h
    push 0773BFF5Ch
    push 0FB038Fh
    mov al, [esp+8]
    @
    invoke ExitProcess, [esp]
testfunc ENDP

END main

Windows:
If you assemble, link, and run this, then print the exit status:

C:\masm32\bin\ml /c /Zd /coff script_name.asm
C:\masm32\bin\Link /SUBSYSTEM:CONSOLE script_name.obj
script_name.exe
echo %ERRORLEVEL%

The program's exit status (printed with echo) is the number that mov [esp], al stored to stack memory, passed as the arg to ExitProcess, ending in hex FF. (%ERRORLEVEL% converts the number to a string of decimal digits, not hex, but it's the same number.)

However, without the [] around esp, we also have to change AL to EAX, because mov needs both operands to be the same size and ESP is a 32-bit register. If you also remove the brackets from the last place esp is used in the code, the result is the pointer to the stack region held in esp, not the memory there.

testfunc proc
    mov eax, esp
    mov bl, 0FFh
    mov [eax], bl
    COMMENT @
    push 0FFFFh
    push 05EFFB880h
    push 0773BFF5Ch
    push 0FB038Fh
    mov al, [esp+8]
    @
    invoke ExitProcess, [esp]
testfunc ENDP

Tag: optional brackets

The above code shows that the brackets always work the same way: whatever is inside them is used as a pointer, and the instruction accesses the value at that pointer. Assembly language is a readable way of writing machine code instead of raw bytes. If you knew how the Windows kernel executes an exe file, you could make your own exe files from scratch inside notepad; there isn't much support for that, although malware analysis material covers it fairly well.

(If you want to test the code, replace the testfunc lines in the first listing with the ones from the last listing and build and run it the same way.) In this case, eax holds the same pointer as esp, an address in the stack segment (the stack segment is important because it has its own instructions: PUSH, which pushes a 32-bit value from an immediate, register, or memory operand, and POP, which pops one into a register or memory operand). So when you execute it, the bare esp operand is the value of the ESP register, a pointer value, not memory contents on the stack.

I'll come back and edit this post once in a while (if I actually get really good at assembly), so this can become an ultimate guide to assembly. I just got started in assembly and am writing a quick script that finds the length of the most significant bit within a specific range.
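
For readers who think in C, the difference between the bare esp operand and a bracketed register can be sketched roughly as follows. This is only an analogy, not what MASM generates: the names stack_slot, copy_pointer, store_through and demo_store are made up for illustration, and an ordinary C variable stands in for the stack memory.

```c
#include <stdint.h>

/* Hypothetical stand-in for the byte of stack memory that esp points at. */
uint8_t stack_slot = 0;

/* Like: mov eax, esp   (no brackets: the pointer value itself is copied) */
uint8_t *copy_pointer(uint8_t *esp) {
    return esp;
}

/* Like: mov [eax], bl  (brackets: the byte goes to the memory eax points at) */
void store_through(uint8_t *eax, uint8_t bl) {
    *eax = bl;
}

/* Runs both steps and returns what ended up in the pointed-to memory. */
uint8_t demo_store(void) {
    uint8_t *eax = copy_pointer(&stack_slot);
    store_through(eax, 0xFF);
    return stack_slot;
}
```

Calling demo_store() leaves 0xFF in stack_slot, while copy_pointer() never touches the memory at all; it only hands back the address.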

Resources that have helped me make this script so far:
A 5-hour tutorial covering the entirety of C++:

https://www.youtube.com/watch?v=vLnPwxZdW4Y&ab_channel=freeCodeCamp.org

I recommend after this doing a scavenger hunt of learning HTML/CSS/JS and making a calculator website (a drag and drop of html file to Microsoft Edge), and scavenger hunt of coding a video game like Undertale (a drag and drop of html file to Microsoft Edge), and then learn Python3 just for jokes.

Helps me find out what stuff like DWORDs are (unsigned long).
https://www.bing.com

Please read the Intel Software Developer's Manual. It tells you things like how writing to a particular position in memory, the interrupt command register of the advanced programmable interrupt controller (APIC), can make another core of the CPU execute code. You don't have to remember it all; I recommend copying everything into txt files and writing a script that searches them for a word each time you add a new section. I didn't memorize anything from the manual, I just keep some of it in the commonsense part of my mind; I hope you, the reader, will retain more.

I read till half of Volume 3 and then skimmed the rest
https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html

I watched some of https://www.youtube.com/c/WhatsACreel videos because I was doing a chapter and had 30 day breaks between reading that so I could understand better. I recommend doing that too, but I don't know how to tell you when to stop and question your thinking to watch a video; I'm sorry.

Davy Wybiral's assembly language tutorial, to watch after all of the above: https://www.youtube.com/watch?v=wLXIWKUWpSs&ab_channel=DavyWybiral

The Intel Software Developer Manual's section called 'Operation Section':

"a register name enclosed in brackets implies the contents of the location whose address is contained in that register."

How to Start Coding Assembly on Windows (MASM)
https://www.youtube.com/watch?v=lCjbwLeLNfs&ab_channel=CharlesClayton

Again, I'll come back to here (this post, and as well as my future posts) and try to educate everyone, so my knowledge is equal with everyone reading.

Upvotes: 1

legends2k

Reputation: 33014

Operands of this type, such as [ebp], are called memory operands.

All the answers here are good, but I see that none mentions the caveat to treating this as a rigid rule: if brackets, then dereference, except when it's the lea instruction.

lea is an exception to the above rule. Say we've

mov eax, [ebp - 4]

4 is subtracted from the value of ebp, and the brackets indicate that the result is taken as an address and the value residing at that address is stored in eax. However, in lea's case, the brackets wouldn't mean that:

lea eax, [ebp - 4]

4 is subtracted from the value of ebp, and the result is stored in eax. This instruction just calculates the address and stores the calculated value in the destination register, without accessing memory. See What is the difference between MOV and LEA? for further details.
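
As a rough C analogy (a sketch only; frame, mov_like and lea_like are hypothetical names, and a two-element array stands in for the stack frame), mov with brackets corresponds to computing an address and dereferencing it, while lea corresponds to the address computation alone:

```c
#include <stdint.h>

/* Hypothetical stack frame: frame[0] sits 4 bytes below frame[1]. */
int32_t frame[2] = {42, 7};

/* Like: mov eax, [ebp - 4]  (compute the address, then load from it) */
int32_t mov_like(const int32_t *ebp) {
    return *(ebp - 1);   /* ebp - 1 element is ebp - 4 bytes for int32_t */
}

/* Like: lea eax, [ebp - 4]  (compute the address only, no memory access) */
const int32_t *lea_like(const int32_t *ebp) {
    return ebp - 1;
}
```

With ebp pointing at frame[1], mov_like returns the stored value 42, whereas lea_like returns the address of frame[0].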

Upvotes: 57

Jason Evans

Reputation: 29186

The brackets mean to de-reference an address. For example

mov eax, [1234]

means, mov the contents of address 1234 to EAX. So:

1234 00001

EAX will contain 00001.
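
To make that concrete, here is a small C sketch that models memory as a byte array. The array memory and the helpers store32, load32 and demo_load_1234 are assumptions made for illustration; real code cannot simply read address 1234. It also shows a detail the example glosses over: a load into EAX reads 4 bytes starting at the address, lowest byte first (little-endian).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flat memory, just large enough to contain address 1234. */
uint8_t memory[2048];

/* Like: mov [addr], eax  (x86 stores the low byte at the lowest address) */
void store32(size_t addr, uint32_t value) {
    for (int i = 0; i < 4; i++)
        memory[addr + i] = (uint8_t)(value >> (8 * i));
}

/* Like: mov eax, [addr]  (reads the 4 bytes starting at addr, little-endian) */
uint32_t load32(size_t addr) {
    uint32_t value = 0;
    for (int i = 0; i < 4; i++)
        value |= (uint32_t)memory[addr + i] << (8 * i);
    return value;
}

/* Puts the value 1 at address 1234, then loads it back as EAX would see it. */
uint32_t demo_load_1234(void) {
    store32(1234, 1);
    return load32(1234);
}
```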

Upvotes: 12

John Dibling

Reputation: 101506

Direct memory addressing - al will be loaded with the value located at memory address L1.

Upvotes: 2

Earlz

Reputation: 63935

Simply means to get the memory at the address marked by the label L1.

If you like C, then think of it like this: [L1] is the same as *L1
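
Spelled out as a minimal C sketch (the function names load_byte_at_L1 and address_of_L1 are made up for illustration), the two instructions from the question correspond to:

```c
/* Like: L1 db "word", 0 */
const char L1[] = "word";

/* Like: mov al, [L1]   (the byte stored at L1) */
char load_byte_at_L1(void) {
    return *L1;
}

/* Like: mov eax, L1    (the address L1 itself, no memory access) */
const char *address_of_L1(void) {
    return L1;
}
```

The first function yields the character 'w'; the second yields a pointer that can be dereferenced later.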

Upvotes: 30

Ignacio Vazquez-Abrams

Reputation: 799560

It indicates that the register should be used as a pointer for the actual location, instead of acting upon the register itself.

Upvotes: 0

Alex Brown

Reputation: 42942

They mean that instead of moving the value of the register or numeric value L1 into the register al, treat the register value or numeric value L1 as a pointer into memory, fetch the contents of that memory address, and move that contents into al.

In this instance, L1 is a memory location, but the same logic would apply if a register name was in the brackets:

mov al, [ebx]

Also known as a load.

Upvotes: 1

paxdiablo

Reputation: 882766

As with many assembler languages, this means indirection. In other words, the first mov loads al with the contents of L1 (the byte 'w' in other words), not the address.

Your second mov actually loads eax with the address L1 and you can later dereference that to get or set its content.

In both those cases, L1 is conceptually considered to be the address.

Upvotes: 1

interjay

Reputation: 110203

[L1] means the memory contents at address L1. After running mov al, [L1] here, the al register will contain the byte at address L1 (the letter 'w').

Upvotes: 63
