Reputation: 115
I'm currently writing an assembly routine and there is something I don't understand.
I have to write, for example, the string "000-00" to a .txt file. The routine is below (it reads the pointer to the output file from the stack and writes the string to it, and it is called from a C program):
.data
.section .text
.global func
func:
pushl %ebp
movl %esp, %ebp
pushl %edi
pushl %esi
movl 12(%ebp), %edi
movl $VAL, (%edi)
movl $VAL2, 4(%edi)
popl %esi
popl %edi
popl %ebp
ret
In the line:
movl $VAL, (%edi)
VAL must be the value that lets me write the string 000-00 to (%edi). What do I have to do? Convert the string to ASCII? I know that '0' in ASCII is 48, so I thought I had to write $48484845 (48 --> '0' and 45 --> '-'),
so I thought that the string 000- was equal to 48484845 in ASCII. What am I doing wrong? Why is it that if I write:
movl $48, (%edi)
the output is correct and I read the character 0 in the .txt file? I simply don't understand how to write multiple ASCII characters in one instruction.
Thanks for the answers and sorry for my bad English.
EDIT: I found this value:
movl $757935149, (%edi)
and this instruction writes the string "-,--". I don't understand: the ASCII code for "-" is 45 and for "," is 44...
Upvotes: 0
Views: 2757
Reputation: 16596
VAL must be the value that lets me write the string 000-00 to (%edi). What do I have to do? Convert the string to ASCII? I know that '0' in ASCII is 48, so I thought I had to write $48484845 (48 --> '0' and 45 --> '-'),
so I thought that the string 000- was equal to 48484845 in ASCII. What am I doing wrong?
You are very close, but you are skipping the part about how values are encoded in a computer. movl stores 32 bits into memory, memory is addressable by bytes (8 bits), and characters encoded in ASCII are usually stored as one byte per character (pure ASCII needs only 7 bits, so the 8th bit is always zero).
So what you need is a value VAL that stores four bytes into memory: 48, 48, 48, 45.
But your decimal value 48484845 has two problems. First, when you check how that value looks in binary, it is 0000_0010_1110_0011_1101_0001_1110_1101 (or 0x02E3D1ED in hexadecimal), so its bytes are not 48, 48, 48, 45 at all. Second, x86 is a little-endian system, so those 32 bits are split and written into memory as the bytes 1110_1101 (237 or 0xED), 1101_0001 (209 or 0xD1), 1110_0011 (227 or 0xE3) and 0000_0010 (2 or 0x02). Little-endian means that the low 8 bits go into memory first (at address edi+0) and the top 8 bits go into memory last (at address edi+3).
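To see this byte order without running the assembly, here is a minimal C sketch that mimics what a 32-bit little-endian store such as movl does. The names store_le32 and store_le32_demo are made up for this illustration; the shifts make the result independent of the endianness of the machine that runs the sketch, and the character checks assume an ASCII character set.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: split a 32-bit value into bytes the way a
   little-endian store (e.g. x86 movl) lays them out in memory. */
void store_le32(uint32_t value, unsigned char out[4])
{
    out[0] = (unsigned char)(value & 0xFF);         /* low 8 bits, address +0 */
    out[1] = (unsigned char)((value >> 8) & 0xFF);
    out[2] = (unsigned char)((value >> 16) & 0xFF);
    out[3] = (unsigned char)((value >> 24) & 0xFF); /* top 8 bits, address +3 */
}

/* Self-check using the two decimal values from the question. */
void store_le32_demo(void)
{
    unsigned char b[4];

    store_le32(48484845u, b);   /* 0x02E3D1ED: not "000-" at all */
    assert(b[0] == 0xED && b[1] == 0xD1 && b[2] == 0xE3 && b[3] == 0x02);

    store_le32(757935149u, b);  /* 0x2D2D2C2D: bytes 45, 44, 45, 45 */
    assert(b[0] == '-' && b[1] == ',' && b[2] == '-' && b[3] == '-');
}
```

Calling store_le32_demo() from a main function passes silently. The second check also accounts for the "-,--" output from the question's EDIT: 757935149 is 0x2D2D2C2D, and its bytes land in memory low byte first.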
So you need to put the three "48" values into the low 24 bits and the value "45" into the top 8 bits, i.e. you need the value 48 + 48*256 + 48*256*256 + 45*256*256*256 = 758132784, which converted to hexadecimal is 0x2D303030. Each *256 "moves" the value 8 bits up, because 2^8 = 256.
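The same arithmetic can be written with shifts instead of multiplications. This is a minimal sketch, assuming ASCII character codes; pack4_le and pack4_le_demo are hypothetical helper names, not something from the question's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build the 32-bit constant whose little-endian
   memory image is the four characters c0 c1 c2 c3, in that order. */
uint32_t pack4_le(unsigned char c0, unsigned char c1,
                  unsigned char c2, unsigned char c3)
{
    return (uint32_t)c0            /* lands at address +0 */
         | ((uint32_t)c1 << 8)     /* same as c1 * 256 */
         | ((uint32_t)c2 << 16)    /* same as c2 * 256 * 256 */
         | ((uint32_t)c3 << 24);   /* same as c3 * 256 * 256 * 256 */
}

/* Self-check against the numbers worked out in the answer. */
void pack4_le_demo(void)
{
    uint32_t val = pack4_le('0', '0', '0', '-');

    assert(val == 758132784u);
    assert(val == 0x2D303030u);
}
```

The result is the constant to use for VAL in the question's routine, i.e. movl $0x2D303030, (%edi).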
Now if you pay attention to those hexadecimal values, you may notice that every hexadecimal digit is built from exactly 4 bits (while the decimal digits 0..9 are spread across bits shared partially with the previous/next digit, so they are not easy to compose or extract from binary). With the value 0x2D303030 you can read the separate bytes in your head as 2D_30_30_30, and because they are stored into memory in little-endian order, the memory will be set to the four bytes (in hexadecimal) 30 30 30 2D, which read as an ASCII string form "000-".
So if you need to define a constant with particular bits set, and you don't want to use a calculator or write (48 + 48*256) in the source, you can often get away with defining it in hexadecimal, for example 0x8001 to set the top and bottom bits of a 16-bit value. Each pair of hexadecimal digits forms exactly one byte (8 bits), so hexadecimal formatting shows the separate byte values of larger types (word/dword/qword).
(BTW, 0x8001 is 32769 in decimal, which I can calculate in my head because I have had the first 16 powers of two memorized since my 8-bit computer programming days: the top bit is 2^15 = 32768 and the bottom bit is 2^0 = 1. But hexadecimal formatting is much simpler and more convenient for this kind of task.)
BTW, movl $48,(%edi) writes 4 bytes (the "l" suffix on mov means "long" = 32-bit value), so you are writing four bytes: 48, 0, 0, 0. Also, for your original task, if you should really output only "000-00" and nothing else, the second mov with VAL2 should be movw $0x3030, 4(%edi) to write only two bytes into memory. With movl you would store the value 0x3030 as four bytes 48, 48, 0, 0, i.e. adding two zeroed bytes after the string.
Then again, if the caller reserved enough space for the string buffer (at least 8 bytes) and writes only 6 bytes to the file on disk, the second movl is harmless. If the caller has only a 6-byte buffer, the second movl is a buffer-overrun bug.
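The difference between the two stores can be simulated in C. This is only a sketch: the 8-byte buffer, the '#' filler, and all helper names are assumptions for illustration, and the byte-wise stores stand in for movl and movw.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for movl: store 4 bytes, low byte first. */
static void put32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)v;
    p[1] = (unsigned char)(v >> 8);
    p[2] = (unsigned char)(v >> 16);
    p[3] = (unsigned char)(v >> 24);
}

/* Stand-in for movw: store 2 bytes, low byte first. */
static void put16(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char)v;
    p[1] = (unsigned char)(v >> 8);
}

/* Fill an 8-byte buffer with '#', then write "000-00" into it.
   wide_tail != 0 uses a 4-byte second store (like movl),
   wide_tail == 0 uses a 2-byte second store (like movw). */
void write_000_00(unsigned char buf[8], int wide_tail)
{
    memset(buf, '#', 8);
    put32(buf, 0x2D303030u);        /* "000-" at offsets 0..3 */
    if (wide_tail)
        put32(buf + 4, 0x3030u);    /* "00" plus two zero bytes, offsets 4..7 */
    else
        put16(buf + 4, 0x3030u);    /* "00" only, offsets 4..5 */
}

void write_000_00_demo(void)
{
    unsigned char buf[8];

    write_000_00(buf, 0);
    assert(memcmp(buf, "000-00##", 8) == 0);    /* bytes 6..7 untouched */

    write_000_00(buf, 1);
    assert(memcmp(buf, "000-00\0\0", 8) == 0);  /* bytes 6..7 overwritten */
}
```

With only a 6-byte buffer, the wide variant would write two bytes past the end, which is the overrun described above.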
Upvotes: 2