mark Franz

Reputation: 115

Assembly instruction for writing an ASCII String

I'm currently writing an assembly routine and I don't understand something.

I have to write, for example, the string "000-00" to a .txt file. The routine is below (it reads the pointer to the output from the stack and writes the string through it; it is called from a C program):

.data
.section .text
.global func

func:
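      pushl %ebp                # standard prologue
      movl %esp, %ebp

      pushl %edi                # save the registers used below
      pushl %esi

      movl 12(%ebp), %edi       # second stack argument: the output pointer
      movl $VAL, (%edi)         # store 4 bytes at edi+0 .. edi+3
      movl $VAL2, 4(%edi)       # store 4 more bytes at edi+4 .. edi+7

      popl %esi                 # restore registers in reverse order
      popl %edi
      popl %ebp
      ret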

In the line:

movl $VAL, (%edi) 

VAL must be the value that lets me write the string 000-00 to (%edi). What do I have to do? Convert the string to ASCII? I know that '0' in ASCII is 48, so I thought I had to write $48484845 (48 --> '0' and 45 --> '-').

So I thought that the string 000- was equal to 48484845 in ASCII. What am I doing wrong? Why, if I write:

movl $48, (%edi)

is the output correct, so that I read the character 0 in the .txt file? I simply don't understand how to write multiple ASCII characters in one instruction.

Thanks for the answers, and sorry for my bad English.

EDIT: I found this value:

movl $757935149, (%edi)

and this instruction writes the string "-,--". I don't understand: the ASCII code for "-" is 45 and for "," is 44...

Upvotes: 0

Views: 2757

Answers (1)

Ped7g

Reputation: 16596

VAL must be the value that lets me write the string 000-00 to (%edi). What do I have to do? Convert the string to ASCII? I know that '0' in ASCII is 48, so I thought I had to write $48484845 (48 --> '0' and 45 --> '-').

So I thought that the string 000- was equal to 48484845 in ASCII. What am I doing wrong?

You are very close, but you are skipping how values are encoded in a computer. movl stores 32 bits into memory, memory is addressable by bytes (8 bits), and ASCII text is usually stored as one byte per character (pure ASCII needs only 7 bits, so the 8th bit is always zero).

So what you need is a value VAL that stores four bytes into memory: 48, 48, 48, 45.

But your decimal value 48484845 has two problems. First, writing the decimal codes next to each other does not put each code into its own byte: in binary the value is 0000_0010_1110_0011_1101_0001_1110_1101 (hexadecimal 0x02E3D1ED). Second, x86 is a little-endian system, so those 32 bits are split and written to memory as the bytes 1110_1101 (237 or 0xED), 1101_0001 (209 or 0xD1), 1110_0011 (227 or 0xE3) and 0000_0010 (2 or 0x02). Little-endian means that the low 8 bits go into memory first (at address edi+0) and the top 8 bits go into memory last (at address edi+3).
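To see this byte split without an assembler, here is a small C sketch. The helper name store_le32 is made up for illustration; it models a little-endian 32-bit store with explicit shifts, so it gives the same result on any host.

```c
#include <stdint.h>

/* Hypothetical helper: write a 32-bit value into out[0..3] the way a
   little-endian store such as `movl $value, (%edi)` lays it out:
   lowest 8 bits at offset 0, highest 8 bits at offset 3. */
void store_le32(uint32_t value, unsigned char out[4])
{
    out[0] = (unsigned char)(value & 0xFF);          /* edi+0 */
    out[1] = (unsigned char)((value >> 8) & 0xFF);   /* edi+1 */
    out[2] = (unsigned char)((value >> 16) & 0xFF);  /* edi+2 */
    out[3] = (unsigned char)((value >> 24) & 0xFF);  /* edi+3 */
}
```

Calling store_le32(48484845u, b) leaves b as ED D1 E3 02 (237, 209, 227, 2), none of which is the ASCII code 48 or 45.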

So you need to put the three "48" values into the low 24 bits and the value "45" into the top 8 bits, i.e. you need the value 48 + 48*256 + 48*256*256 + 45*256*256*256 = 758132784, which in hexadecimal is 0x2D303030. Multiplying by 256 "moves" a value 8 bits up, because 2^8 = 256.
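The same calculation can be written with shifts instead of multiplications. This is only a sketch; pack4 is a hypothetical helper name, and the first argument is the character that should end up at the lowest address.

```c
#include <stdint.h>

/* Hypothetical helper: build the 32-bit immediate whose little-endian
   store puts c0 at offset 0, c1 at offset 1, c2 at offset 2 and c3 at
   offset 3. Shifting left by 8 is the same as multiplying by 256. */
uint32_t pack4(char c0, char c1, char c2, char c3)
{
    return  (uint32_t)(unsigned char)c0
         | ((uint32_t)(unsigned char)c1 << 8)
         | ((uint32_t)(unsigned char)c2 << 16)
         | ((uint32_t)(unsigned char)c3 << 24);
}
```

pack4('0', '0', '0', '-') gives 0x2D303030 (758132784), the VAL needed for "000-".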

If you pay attention to those hex values, you may notice that every hexadecimal digit corresponds to exactly 4 bits (while the decimal digits 0..9 are spread across bits shared partially with the neighbouring digits, so they are not easy to compose or extract from binary). With 0x2D303030 you can read the separate bytes in your head as 2D_30_30_30, and because they are stored in little-endian order, memory will contain the four bytes (in hex) 30 30 30 2D, which read as an ASCII string form "000-".

So if you need to define a constant with particular bits set, and you don't want to use a calculator or write (48 + 48*256) in the source, you can often get away with writing it in hexadecimal, for example 0x8001 to set the top and bottom bits of a 16-bit value. Each pair of hex digits forms exactly one byte (8 bits), so in hex you can see the separate byte values of larger types (word/dword/qword).

(BTW, 0x8001 is 32769 in decimal, which I can calculate in my head because I have had the first 16 powers of two memorized since my 8-bit programming days: the top bit is 2^15 = 32768 and the bottom bit is 2^0 = 1. But hexadecimal is much simpler and more convenient for this kind of task.)
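Going the other way explains the value 757935149 from the question's EDIT: in hex it is 0x2D2D2C2D, so the bytes in memory are 2D 2C 2D 2D. A minimal C sketch of that decoding, with dword_to_text as a made-up helper name:

```c
#include <stdint.h>

/* Hypothetical helper: turn a 32-bit value into the four characters a
   little-endian 32-bit store leaves in memory, lowest byte first.
   out needs room for 5 chars (4 characters plus the terminating NUL). */
void dword_to_text(uint32_t value, char out[5])
{
    for (int i = 0; i < 4; i++)
        out[i] = (char)((value >> (8 * i)) & 0xFF);  /* byte i of the value */
    out[4] = '\0';
}
```

dword_to_text(757935149u, s) yields "-,--", and dword_to_text(0x2D303030u, s) yields "000-".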

BTW, movl $48, (%edi) writes 4 bytes (the "l" suffix on mov means "long", a 32-bit value), so you are actually writing four bytes: 48, 0, 0, 0. As for your original task: if you really should output only "000-00" and nothing else, the second mov with VAL2 should be movw $0x3030, 4(%edi), which writes only two bytes into memory. With movl, the value 0x3030 is stored as four bytes 48, 48, 0, 0, i.e. two zero bytes are added after the string.

Then again, if the caller reserved enough space for the string buffer (at least 8 bytes) and writes only 6 bytes to the file on disk, the second movl is harmless. If the caller has only a 6-byte buffer, the second movl is a buffer-overrun bug.
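The difference between the two endings can be modelled in C. This is a sketch only: the function names are invented, and store_le stands in for a little-endian mov of 2 or 4 bytes.

```c
#include <stdint.h>

/* Hypothetical model of a little-endian store of nbytes bytes
   (2 for movw, 4 for movl), lowest byte first. */
static void store_le(unsigned char *dst, uint32_t value, int nbytes)
{
    for (int i = 0; i < nbytes; i++)
        dst[i] = (unsigned char)(value >> (8 * i));
}

/* movl $0x2D303030, (%edi) ; movw $0x3030, 4(%edi)  -> touches 6 bytes */
void write_movl_movw(unsigned char *buf)
{
    store_le(buf, 0x2D303030u, 4);   /* "000-" */
    store_le(buf + 4, 0x3030u, 2);   /* "00"   */
}

/* movl $0x2D303030, (%edi) ; movl $0x3030, 4(%edi)  -> touches 8 bytes */
void write_movl_movl(unsigned char *buf)
{
    store_le(buf, 0x2D303030u, 4);   /* "000-"              */
    store_le(buf + 4, 0x3030u, 4);   /* "00" + two 0 bytes  */
}
```

With an 8-byte buffer both variants produce "000-00" in the first six bytes; only write_movl_movl also overwrites bytes 6 and 7 with zeros, which is exactly the overrun if the caller's buffer is just 6 bytes long.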

Upvotes: 2
