Brando Muñoz
Brando Muñoz

Reputation: 9

A better form to find a Substring in a string in assembly code (MASM)?

So i made this code with the knowledge i gather in various sites, i think there is an optimazed way to do it without pushing and poping the registers on the stack memory, but i don't know how to do it. Here is my Code

comparing proc
    MOV CX, SIZEOF vec2 ;The size of the substring
    DEC CX
    MOV DX, SIZEOF vec1 ; The size of the String
    LEA SI, vec1        ; The String
    LEA DI, vec2        ; The substring

FIND_FIRST:        
    MOV AL, [SI];   storing the ascii value of current character of mainstring 
    MOV AH, [DI];   storing the ascii value of current character of substring
    CMP AL,AH;      comparing both character
    JE FITTING;      if we find it we try to find the whole substring
    JNE NEXT

NEXT:
    INC SI; We go to the next char
    DEC DX; And the size of the string decreased
    JE N_FINDED
    JMP FIND_FIRST
FITTING:
    CLD
    PUSH CX ; I push this register because in the instruction REPE CMPSB
    PUSH SI ; They change.
    PUSH DI
    REPE CMPSB
    JNE N_FITTING
    JE FINDED
N_FITTING:
    POP DI
    POP SI
    POP CX
    JMP NEXT ; if the complete substring doesn't fit we go to the next char
FINDED:
    POP DI
    POP SI
    POP CX
    MOV AL, 0001H;  substring found
    JMP RETURN


N_FINDED:    
    MOV AL, 0000H;  substring not found

RETURN:
    ret 
comparing endp

Upvotes: -1

Views: 747

Answers (1)

Sep Roland
Sep Roland

Reputation: 39691

If the substring happens to have more than one character, which is very likely, then your code will start comparing bytes that are beyond the string to search through.
With the string referred to by DI and its length DX, and the substring referred to by SI and its length CX, you first need to make sure that neither string is empty, and then you need to limit the number of possible finds. Next 4 lines of code do just that:

    jcxz NotFound   ; Substring is empty
    sub  dx, cx
    jb   NotFound   ; Substring is longer than String
                    ; Happens also when String is empty
    inc  dx

As an example consider the string "overflow" (DX=8) and the substring "basic" (CX=5):

sub  dx, cx    ; 8 - 5 = 3
inc  dx        ; 3 + 1 = 4  Number of possible finds is 4

overflow

basic       possible find number 1
 basic      possible find number 2
  basic     possible find number 3
   basic    possible find number 4

You can write your proc without having to preserve those registers on the stack (or elsewhere) all the time. Just introduce another register so you don't have to clobber the CX, SI, and DI registers:

    jcxz NotFound
    sub  dx, cx
    jb   NotFound
    inc  dx

    mov  al, [si]       ; Permanent load of first char of the Substring
FirstCharLoop:
    cmp  [di], al
    je   FirstCharMatch
NextFirstChar:
    inc  di
    dec  dx             ; More tries?
    jnz  FirstCharLoop  ; Yes
NotFound:
    xor  ax, ax
    ret

FirstCharMatch:
    mov  bx, cx
    dec  bx
    jz   Found          ; Substring had only 1 character
OtherCharsLoop:
    mov  ah, [si+bx]
    cmp  [di+bx], ah
    jne  NextFirstChar
    dec  bx
    jnz  OtherCharsLoop
Found:
    mov  ax, 1
    ret

Do note that this code now does not compare the first character again like the repe cmpsb in your program did.
AX being the result, the only registers that are clobbered (and that you maybe could want to preserve) are BX, DX, and DI.

Upvotes: 1

Related Questions