Eames

Reputation: 319

NASM modifying a string

I'm new to Assembly (all types), so I was following tutorials from tutorialspoint.com. In particular, I was on the page https://www.tutorialspoint.com/assembly_programming/assembly_addressing_modes.htm, which is about Assembly addressing modes. Everything was working up until the final example, which gave this code (it takes the name Zara Ali and changes it to Nuha Ali):

section .text
   global _start     ;must be declared for linker (ld)
_start:             ;tell linker entry point

   ;writing the name 'Zara Ali'
   mov  edx,9       ;message length
   mov  ecx, name   ;message to write
   mov  ebx,1       ;file descriptor (stdout)
   mov  eax,4       ;system call number (sys_write)
   int  0x80        ;call kernel

   mov  [name],  dword 'Nuha'    ; Changed the name to Nuha Ali

   ;writing the name 'Nuha Ali'
   mov  edx,8       ;message length
   mov  ecx,name    ;message to write
   mov  ebx,1       ;file descriptor (stdout)
   mov  eax,4       ;system call number (sys_write)
   int  0x80        ;call kernel

   mov  eax,1       ;system call number (sys_exit)
   int  0x80        ;call kernel

section .data
name db 'Zara Ali', 0xa

This code works, of course. However, when I modified it slightly, I ran into problems. In line 13, I changed 'Nuha' to 'Nuhas' just to see if it would come up with 'NuhasAli' (I'm assuming that line just replaces whatever bytes are there with Nuhas and keeps the rest, Ali).

When I tried this and ran the command 'nasm -f elf helloasm.asm' (helloasm.asm is the name of the file), it gave me these two messages:

helloasm.asm:13: warning: character constant too long
helloasm.asm:13: warning: dword data exceeds bounds

I couldn't find any insight into the first warning: when I looked it up, all I got were results about C and C++. As for the second warning, I tried to stop the dword data from exceeding the bounds by simply making it a qword instead of a dword:

section .text
   global _start     ;must be declared for linker (ld)

_start:             ;tell linker entry point

   ;writing the name 'Zara Ali'
   mov  edx,9       ;message length
   mov  ecx, name   ;message to write
   mov  ebx,1       ;file descriptor (stdout)
   mov  eax,4       ;system call number (sys_write)
   int  0x80        ;call kernel

   mov [name], qword 'Nuhas'    ; Changed the name to Nuha Ali

   ;writing the name 'Nuha Ali'
   mov  edx,10      ;message length
   mov  ecx,name    ;message to write
   mov  ebx,1       ;file descriptor (stdout)
   mov  eax,4       ;system call number (sys_write)
   int  0x80        ;call kernel

   mov  eax,1       ;system call number (sys_exit)
   int  0x80        ;call kernel

section .data
    name dw 'Zara Ali', 0xa

helloasm.asm:13: warning: character constant too long
helloasm.asm:13: error: operation size not specified

At this point, I was stumped. Can anyone offer any insight into why this is happening and how I should fix it? Are my fundamentals wrong? Thanks in advance for any help.

Upvotes: 0

Views: 2502

Answers (1)

Mike Nakis

Reputation: 61993

mov [name], dword 'Nuhas' fails because a DWORD is four bytes long, but 'Nuhas' is five bytes long. There is nothing strange about that: the assembler is telling you precisely what is wrong, and it even says it in two different ways, "character constant too long" and "dword data exceeds bounds". You just can't fit 5 bytes into a place that can only hold 4 bytes.

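mov [name], qword 'Nuhas' fails because you are assembling 32-bit code, and in 32-bit mode there is no form of mov that stores a 64-bit operand, so the assembler rejects the instruction. This has nothing to do with how name was declared: NASM does not keep track of the size of the data behind a label. And even if you somehow managed to shoehorn 8 bytes in there, the three bytes after 'Nuhas' would overwrite 'Ali'.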
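One way to get 'NuhasAli' in 32-bit code is to split the five bytes into a DWORD store followed by a BYTE store. The following is a minimal sketch, assuming 32-bit Linux and the db declaration from the tutorial; the file name nuhas.asm and the build line in the first comment are illustrative assumptions, not something from the question.

```nasm
; Assumed build: nasm -f elf32 nuhas.asm && ld -m elf_i386 -o nuhas nuhas.o
section .text
   global _start

_start:
   mov  dword [name], 'Nuha'   ;4 bytes: overwrites 'Zara' (same value as 0x6168754E)
   mov  byte [name+4], 's'     ;1 byte: overwrites the space before 'Ali'

   ;writing the name 'NuhasAli'
   mov  edx,9       ;message length: 8 characters plus the newline
   mov  ecx,name    ;message to write
   mov  ebx,1       ;file descriptor (stdout)
   mov  eax,4       ;system call number (sys_write)
   int  0x80        ;call kernel

   mov  eax,1       ;system call number (sys_exit)
   xor  ebx,ebx     ;exit status 0
   int  0x80        ;call kernel

section .data
name db 'Zara Ali', 0xa
```

The size keyword sits on the memory operand here, but NASM accepts it on either operand; what matters is that each store names a size the instruction set actually supports.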

EDIT

I guess the confusion stems from the fact that the code you posted has name dw 'Zara Ali', 0xa, which is perhaps making you believe that the entire text fits in a single WORD or DWORD. I don't know why dw was used there; it should be db, as in the first listing. It only assembles because NASM accepts a string in dw and pads it to a whole number of 16-bit words. For the code to be understandable and intuitive it should use db, because that's what you are declaring there: a sequence of bytes.
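To make the difference between the data directives concrete, here is a small sketch of what NASM emits for the same operands under db, dw and dd. The label names and the file name sizes.asm are made up for illustration; the byte values in the comments can be checked in the listing file.

```nasm
; Assumed command: nasm -f bin -l sizes.lst sizes.asm  (sizes.lst shows the emitted bytes)
as_bytes   db 'Zara Ali', 0xa   ;9 bytes:  8 characters + 0A
as_words   dw 'Zara Ali', 0xa   ;10 bytes: 8 characters (4 whole words) + 0A 00
as_dwords  dd 'Zara Ali', 0xa   ;12 bytes: 8 characters (2 whole dwords) + 0A 00 00 00
odd_words  dw 'abc'             ;4 bytes:  61 62 63 00, string padded to whole words
```

With dw, the newline becomes the two bytes 0A 00, so writing 10 bytes to stdout also sends a trailing zero byte.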

In any case, the dw directive has nothing to do with the mov [name], dword 'wxyz' instruction. The dw directive declares an arbitrary-length sequence of 16-bit WORDs (dd is the one that declares DWORDs). The mov instruction copies a fixed number of bytes: a single byte, two bytes (a WORD) or four bytes (a DWORD). (In 64-bit code it can also copy eight bytes, a QWORD.) NASM does not perform any type checking between the two: it does not remember how name was declared, which is exactly why the instruction itself has to carry a size keyword such as dword.

So, the mov [name], dword 'wxyz' instruction is actually a hack: it packs the four characters you gave it into a DWORD and stores that at [name]. It works regardless of whether name was defined with db or dw, because all the instruction needs is an address. However, only four characters can be packed in a DWORD, which is why 'Nuhas' does not work. Also, you cannot do name dq 'bla bla bla' and expect mov [name], qword 'wxyzwxyz' to work, because the instruction set you are working with is 32-bit and has no mov that stores a qword. (In 64-bit code you can load a 64-bit immediate into a register and store the register; a direct store to memory still takes at most a 32-bit immediate.)
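For comparison, here is a sketch of the 64-bit route, assuming x86-64 Linux (so syscall and the 64-bit system call numbers replace int 0x80). The eight characters go through a register, since mov with a 64-bit register destination accepts a full 64-bit immediate. The file name nuhas64.asm and the build line are illustrative assumptions.

```nasm
; Assumed build: nasm -f elf64 nuhas64.asm && ld -o nuhas64 nuhas64.o
default rel

section .text
   global _start

_start:
   mov  rax, 'NuhasAli'   ;eight characters packed into one 64-bit immediate
   mov  [name], rax       ;8-byte store; the size is implied by rax

   ;writing the name 'NuhasAli'
   mov  eax,1       ;system call number (sys_write on x86-64)
   mov  edi,1       ;file descriptor (stdout)
   lea  rsi,[name]  ;message to write
   mov  edx,9       ;message length: 8 characters plus the newline
   syscall

   mov  eax,60      ;system call number (sys_exit on x86-64)
   xor  edi,edi     ;exit status 0
   syscall

section .data
name db 'Zara Ali', 0xa
```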

Upvotes: 3
