Jeemong

Reputation: 11

Assembly (Intel x86) function to find the length of a string: why am I getting extra characters?

I am a beginner in assembly and I have this homework where I have to create a strlen function to find the length of any string.

I tried subtracting 4 from edx because I am seeing 4 extra characters at the end, but that did not fix anything. They are still there.

section .data   
text: db "Hello world, trying to find length of string using function."     ;our string to be outputted

section .text
global _start   ;declared for linker

_start:     
    mov eax, 4      ;system call number (sys write)
    mov ebx, 1      ;file descriptor to write-only
    mov ecx, text   ;message to output
    call strlen
    mov edx, len    ;length of string to print
    int 80h         ;interrupt

exit:       
    mov eax, 1  ;system call number (sys exit)
    mov ebx, 0  ;file descriptor to read-only
    int 80h     ;interrupt

strlen: 
    push ebp        ;prologue, save base pointer
    mov ebp, esp    ;copy esp to ebp
    push edi        ;push edi for use

                    ;body
    mov edi, text   ;save text to edi, and i think when i do that edi expands? if text = 5 bytes, and edi was originally 4, then edi becomes 5?
    sub edi, esp    ;subtract edi starting point by the esp starting point to get len. ex: edi = 100, esp = 95
    mov [len], edi  ;copy value of edi onto len

    pop edi         ;epilogue, pop edi out of stack
    mov esp, ebp    ;return esp back to top of stack
    pop ebp         ;pop ebp back to original
    ret             ;return address



section .bss    
len: resb 4 ;4 byte to integer

Let's say I have the following code in the .data section:

section .data   
text: db "Hello world, trying to find length of string using function."

The expected output is "Hello world, trying to find length of string using function.", but I am getting "Hello world, trying to find length of string using function.####", where each # is a random character.

Thank you.

Upvotes: 1

Views: 7217

Answers (2)

PYigit

Reputation: 170

The other answer is correct, but I want to add a slightly shorter method to find the length of a string. It assumes rdi holds the address of a zero-terminated string, and keep in mind that rdi is modified.

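; repne = repeat while rcx != 0 and ZF = 0 (stops at the first match)
; scasb = compare AL with the byte at [rdi], then inc rdi (with DF = 0)
; each repetition also decrements rcx

  mov rcx, -1   ; largest possible count
  xor eax, eax  ; AL = 0, the byte to search for
  repne scasb   ; scan forward from [rdi] until a zero byte is found
  not rcx   ; rcx = -rcx - 1 = length + 1
  dec rcx   ; drop the terminating zero, which was counted too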

Unlike rep movsb / rep stosb, current CPUs still only check 1 byte at a time (https://agner.org/optimize), and microcode startup overhead means it can be slower. Speed for large inputs should be similar to the loop in Sep's answer, but this probably can't suffer from branch misprediction, or benefit from correct prediction.
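
The snippet above is 64-bit, while the question is 32-bit code that passes the string address in ECX. As a rough sketch of how the same idea might look there: strlen_scas is a made-up name, and returning the length in EDX is an assumption chosen to suit the sys_write call, not something from the question.

```nasm
; Hypothetical 32-bit helper (name and register convention are assumptions).
; In:  ECX = address of a zero-terminated string
; Out: EDX = length in bytes, not counting the terminator
; EAX, ECX and EDI are preserved for the caller.
strlen_scas:
        push    eax
        push    ecx
        push    edi
        mov     edi, ecx        ; scasb reads from [edi]
        xor     eax, eax        ; AL = 0, the byte to search for
        mov     ecx, -1         ; largest possible count
        cld                     ; make sure scasb moves forward
        repne scasb             ; stop after the zero byte has been compared
        not     ecx             ; ecx = length + 1
        dec     ecx             ; drop the terminator from the count
        mov     edx, ecx        ; return the length in EDX
        pop     edi
        pop     ecx
        pop     eax
        ret
```

Saving EAX matters here because the caller in the question loads the system call number into EAX before calling the function.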

Upvotes: 0

Sep Roland

Reputation: 39166

Prior to calling strlen, you've loaded ECX with the address of the string for which you desire to know the length. Then use ECX in your function directly.
You don't need to use the prolog/epilog code on this little task.

strlen: push    ecx
        dec     ecx
.loop:  inc     ecx
        cmp     byte [ecx], 0      ; NASM syntax: no "ptr" keyword
        jne     .loop
        sub     ecx, [esp]
        mov     [len], ecx         ; Save length
        pop     ecx
        ret

This code walks through the string until it finds a zero byte. At that point the starting address (saved on the stack at [esp]) is subtracted from the address where the zero was found (in ECX). The difference is the length.

Instead of putting the result in a memory variable, you could choose to return it in the EDX register - ready to use next!
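
One possible shape for that variant is sketched below. It keeps the assumption that ECX holds the string address on entry and uses EDX as the scanning pointer, so nothing has to be saved on the stack.

```nasm
; Sketch of a register-returning variant (an illustration, not the only way).
; In:  ECX = address of a zero-terminated string
; Out: EDX = length in bytes; ECX is left unchanged
strlen: mov     edx, ecx        ; EDX scans forward from the start
.loop:  cmp     byte [edx], 0   ; reached the terminating zero?
        je      .done
        inc     edx
        jmp     .loop
.done:  sub     edx, ecx        ; end address - start address = length
        ret
```

With this version the caller can go straight from call strlen to int 80h, because EDX already holds the byte count that sys_write expects.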

This version of strlen can only work if you make sure that the string is actually zero-terminated. Just append the zero.

section .data   
text: db "Hello world, trying to find length of string using function.",0

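This is NASM, where a bare label such as len evaluates to its address, so the following loads the address of len into EDX rather than the length stored there: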

call strlen
mov edx, len    ;length of string to print
int 80h         ;interrupt

You need the square brackets around len in order to fetch the length that is stored at that location.

call    strlen
mov     edx, [len]    ; Length of string to print
int     80h
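
Putting the pieces together, a complete program could look like the sketch below. The build commands and the file name strlen.asm are assumptions for a 32-bit Linux target. The string address is loaded into ECX before the call, and the other sys_write registers are set afterwards.

```nasm
; Assumed build steps (file name is arbitrary):
;   nasm -f elf32 strlen.asm -o strlen.o
;   ld -m elf_i386 strlen.o -o strlen

section .data
text:   db "Hello world, trying to find length of string using function.", 0

section .bss
len:    resd 1                  ; one 32-bit slot for the length

section .text
global _start

_start:
        mov     ecx, text       ; address of the string, also used by sys_write
        call    strlen          ; stores the length at [len]
        mov     eax, 4          ; system call number (sys_write)
        mov     ebx, 1          ; file descriptor 1 = stdout
        mov     edx, [len]      ; the stored length, not the address of len
        int     80h

        mov     eax, 1          ; system call number (sys_exit)
        xor     ebx, ebx        ; exit status 0
        int     80h

strlen: push    ecx             ; remember the start address
        dec     ecx
.loop:  inc     ecx
        cmp     byte [ecx], 0   ; stop at the terminating zero
        jne     .loop
        sub     ecx, [esp]      ; end address - start address
        mov     [len], ecx      ; save length
        pop     ecx             ; restore the start address
        ret
```

The terminating zero is only there for strlen; it is not written to the terminal because the length passed to sys_write excludes it.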

Upvotes: 2
