goal4321

Reputation: 121

Accessing address of a string in inline assembly in gcc

I have written the assembly code below to convert a string from lowercase to uppercase. It does not work, because I am not able to access the address of the string I am converting. Why is this code not working?

  #include<stdio.h>
  int convert(char *str)
  {
       char *ptr;
  __asm__ __volatile__ ( "movl (%1),%%ebx;"
                    "subl $1,%%ebx;"
                    "movl %%ebx,%0;"
            "REPEAT: addl $1,%%ebx;"
                    "testl %%ebx,%%ebx;"
                    "je END;"
                    "movzbl 0(%%ebx),%%ecx;"
                    "cmpl $97, %%ecx;"
                    "jb END;"
                    "cmpl $122,%%ecx;"
                    "ja END;"
                    "subb $32,0(%%ebx);"
                    "jmp REPEAT;"
              "END: movl %%ebx,(%0);"
                    :"=r" (ptr)
                    :"r"  (str)
                 );
   printf("converted string =%s\n", str);
 }

  int main()
  {
  int i;  
  char str[] = "convert";

  i = convert(str);
  return 0;

  }

Upvotes: 0

Views: 1195

Answers (2)

David Wohlferd

Reputation: 7483

The code in the accepted answer seems to have a few problems:

  • As written, this code does not compile (it references %1 when there is only 1 parameter), and it's missing a terminator on the 4th asm line.
  • This code does not correctly handle strings like "aBc".
  • This code does not use the "memory" clobber, even though it modifies memory.
  • This code (still) modifies a register that is not clobbered (ebx).
  • Doesn't work for x64.
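To make the "aBc" point concrete, here is a rough C model of that loop's control flow. It is a sketch of the logic only, not the asm itself, and the name convert_stops_early is made up for illustration. Because both range checks jump to END, the loop stops at the first character that is not a lowercase letter, so everything after it is left untouched.

```c
/* Hypothetical C model of a loop that jumps to END on any
   character outside 'a'..'z'. Illustration only, ASCII assumed. */
char *convert_stops_early(char *str)
{
    for (char *p = str; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;  /* like movzbl: unsigned compare */
        if (c < 'a' || c > 'z')
            break;                            /* mirrors "jb END" / "ja END" */
        *p = (char)(c - 32);                  /* lowercase -> uppercase */
    }
    return str;
}
```

With the input "aBc" this model yields "ABc": the 'B' ends the loop before the 'c' is reached.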

How about something more like this:

char *convert(char *str)
{
   char *res = str;
   char temp;

   __asm__ __volatile__ (
         "dec %[str]\n"
      "REPEAT:\n\t"    
         "inc %[str]\n\t"
         "movb (%[str]), %[temp]\n\t"  /* Read the next char */
         "testb %[temp], %[temp]\n\t"
         "jz END\n\t"                  /* Is the char null */
         "cmpb $97, %[temp]\n\t"       /* >= 'a'? */
         "jb REPEAT\n\t"
         "cmpb $122, %[temp]\n\t"      /* <= 'z'? */
         "ja REPEAT\n\t"
         "subb $32, %[temp]\n\t"       /* Convert to uppercase */
         "mov %[temp], (%[str])\n\t"   /* Write back to memory */
         "jmp REPEAT\n" 
      "END:\n"
         : [str] "+r" (str), [temp] "=r" (temp)
         : /* no inputs */
         : "memory"
   );

   /* Note that at this point, str points to the null. 
      str - res is the length. */

   return res;
}

This code:

  • Uses fewer registers (2 vs 4).
  • By using "=r" (temp), we are letting the compiler select the best register to use for scratch rather than forcing a specific register.
  • Only reads the memory once (instead of twice).
  • Returns a pointer to the string (instead of returning nothing?).
  • IMO, using %[temp] and %[str] is slightly easier to read than %0 and %1.
  • Using \n\t (instead of ;) makes the output from gcc -S easier to read.
  • This code modifies str (which is why it is listed as "+r").

Or if you really want to get fancy, write the code in C, and use gcc -O2 -S to see the output.
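For reference, a plain C version might look like the sketch below. The name convert_c is an assumption, chosen so it does not clash with convert above, and like the asm it assumes an ASCII character set. Compiling it with gcc -O2 -S shows what the compiler generates for the same loop.

```c
/* Plain C sketch of the same conversion (ASCII assumed).
   convert_c is a hypothetical name used for illustration. */
char *convert_c(char *str)
{
    for (char *p = str; *p != '\0'; p++) {
        if (*p >= 'a' && *p <= 'z')
            *p -= 'a' - 'A';    /* 'a' - 'A' is 32 in ASCII */
    }
    return str;
}
```

Unlike a loop that bails out on the first non-lowercase character, this one skips such characters and keeps going until the terminating null.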

Upvotes: 1

goal4321

Reputation: 121

Here is my solution, slightly different from the above. Thanks to FUZxxi for pointing it out. I should say that looking at the generated assembly helps a great deal: it can be hard to understand, but it shows you the actual problem. I have written enough comments in case somebody wants to understand what I'm trying to achieve.

/* code to convert from lower case to upper case */
int convert(char *str)
{
   __asm__ __volatile__ ( "movl %1,%%ebx;"  // Get the address of str
                "subl $1,%%ebx;"     
        "REPEAT: addl $1,%%ebx;"    
                "movl 0(%%ebx),%%edx"  // Move the contents to edx
                "movzbl %%dl,%%ecx;"   // zero-extend the low byte (current char) into ecx
                "testl %%ecx,%%ecx;"   // check for the terminating null
                "je END;"              
                "cmpl $97, %%ecx;"     
                "jb END;"
                "cmpl $122,%%ecx;"
                "ja END;"
                "subb $32,(%%ebx);"  // if its lower case, subtract 32
                "jmp REPEAT;" 
          "END:;"
                :           // No output specified
                :"r"  (str) //input
                :"ecx","edx" //clobbers
             );
  printf("converted string =%s\n", str);
}

The code above should work if you compile with the "gcc -m32" option when building on amd64. Doing that, I ran into this error:
"fatal error: sys/cdefs.h: No such file or directory"

Solution: install the libc6-dev-i386 package.

Upvotes: 0
