Reputation: 165
I'm having trouble figuring out how to determine whether a value is a number or a letter in MASM assembly language. This program should go through an array, display the first number it finds, and print it along with the index where it was found. I'm using the Irvine32.inc library, which contains IsDigit, but for some reason it isn't working and I don't know why.
Here's the code:
TITLE Number Finder
INCLUDE Irvine32.inc
.data
AlphaNumeric SDWORD 'A', 'p', 'Q', 'M', 67d, -3d, 74d, 'G', 'W', 92d
Alphabetical DWORD 'A', 'B', 'C', 'D', 'E'
Numeric DWORD 0, 1, 2, 3, 4, 5, 6
index DWORD ?
valueFound BYTE "number found: ", 0
atIndex BYTE "at index: ", 0
noValueFound BYTE "no numeric found", 0
spacing BYTE ", ", 0
;DOESNT WORK CORRECTLY
;SKIPS the value 67
.code
main PROC
mov esi, OFFSET AlphaNumeric ;point to start of array
mov ecx, LENGTHOF AlphaNumeric ;set loop counter
mov index, 0
mov eax, 0 ; clear eax
L1: mov al, [esi]
call IsDigit ; ZF = 1 -> valid digit , ZF = 0 -> not a valid digit
;jmp if digit
jz NUMBER_FOUND
;jmp if char
jnz CHARACTER
;this probably never gets reached
inc index
add esi, TYPE AlphaNumeric
loop L1
;if loop finishes without finding a number
jmp NUMBER_NOT_FOUND
;next iteration of loop if val is a char
CHARACTER:
add esi, TYPE AlphaNumeric
add index, 1
loop L1
NUMBER_FOUND:
mov edx, OFFSET valueFound
call WriteString ; prints "number found"
mov eax, [esi]
call WriteInt ; prints the number found
mov edx, OFFSET spacing
call WriteString
mov edx, OFFSET atIndex
call WriteString ; prints "at index: "
mov eax, index
call WriteDec ; prints the index value
;jmp to NEXT to skip NUMBER_NOT_FOUND block
jmp NEXT
NUMBER_NOT_FOUND:
mov edx, OFFSET noValueFound
call WriteString
NEXT:
exit
main ENDP
END main
When I debug it, on the loop iteration that processes the value 67d, it loads 43 into al, which is its hex representation. Since 43h is also the ASCII value of 'C', I'm assuming that call IsDigit treats it as a letter and not a number. It also skips all the numbers and prints "Number found: +65, at index: 10", which shouldn't even happen. Is there an operation I can use to convert the hex value to the decimal value so that the IsDigit call works correctly? If someone could please explain a way to determine whether a value in an array is a number or a letter (uppercase or lowercase), that would be very much appreciated.
Upvotes: 0
Views: 1583
Reputation: 364428
This is an impossible task. The most you can do is check for numbers that aren't the ASCII code for an alphabetic character (https://asciitable.com/); index 5 is the first element where that's the case. Your code doesn't quite do that, though: Irvine32's IsDigit only tests whether AL holds an ASCII digit character '0' through '9', and no element of this array does, so the loop runs off the end and falls through into NUMBER_FOUND with ESI pointing at the next dword in memory ('A' = 65, at index 10).
67 (decimal) is the same byte value as 'C'. Once it's assembled into binary in your .data section, they're the same single byte, so there's no way to tell how it was written in the source; db 67, 'C' is the same pair of bytes as db 'C', 67. It's a number that's in the range of upper-case ASCII codes. Another equivalent way to write the same value in the source is 43h.
Bytes don't have types associated with them, just the 8-bit bit-pattern which represents a value. Different interpretations of the same bits could be different values, e.g. -3 (signed) and 253 (unsigned) are both represented by the bit-pattern 0b11111101, which is 0xfd. All of those are valid ways of writing the value that gets loaded into AL by your program. Numbers in a computer are binary; hex and decimal are just convenient formats for humans, so debuggers convert binary values into strings of ASCII digits for display.
As a character value, it also represents a font glyph in some 8-bit character sets.
If your program doesn't keep track of types separately, that info is not recoverable.
Normally you write programs to know that a whole array holds 8-bit numbers, or holds ASCII codes, just like in C you have functions that take int8_t* or char*: even though those point to the same kind of 8-bit object, they have different semantic meaning for human programmers. Another example would be int* vs. char*; you certainly could look at the bytes of an int array as character data (with many of the characters being '\0' or '\xff' for small positive / negative integer values), but you don't try to figure it out by looking at the byte values. Higher-level languages like Python and Perl store a type along with each object, like a struct { enum type; union { stuff }; }, with many types, such as strings, including a pointer.
Re: implementing an IsAlpha function: see "What is the idea behind ^= 32, that converts lowercase letters to upper and vice versa?"; it only takes a few instructions.
;; input in DL, unmodified
IsAlpha:
mov eax, edx
or al, 20h ; force to lower case if it wasn't already (MASM wants 20h, not 0x20)
sub al, 'a'
cmp al, 25 ; 'z'-'a' = index of the last letter in the alphabet
; setbe al ; for a boolean 0/1 return value in AL
ret
;; return in FLAGS: ja non_alpha or jbe alphabetic
Upvotes: 1