Kaherdin

Reputation: 2127

C : String manipulation with pointer

I come from JS and I'm still struggling to understand pointers. To practice, I've written a function that lowercases a string.

char    *ft_strlowcase(char *str)
{
    int i;

    i = 0;
    while (str[i])
    {
        if (str[i] >= 'A' && str[i] <= 'Z')
        {
            str[i] = str[i] + 32;
        }       
        i++;
    }
    return (str);
}

Now I want to test it. By my logic, I should be able to pass a pointer to my function, like this:

int main(void)
{   
    //char  str1[] = "AbCdEfGhI"; //WORKING
    //char  *str1 = "AbCdEfGhI"; //NOT WORKING
    
   //char   *str1; //NOT WORKING
   //str1[] = "AbCdEfGhI"; //NOT WORKING

    char    *str1; //NOT WORKING
    str1 = "AbCdEfGhI"; //NOT WORKING
    
    printf("Lowercase : %s\n", str1);
    ft_strlowcase(&str1);
    printf("Uppercase : %s\n", str1); 
}

But it doesn't work. The only way I can make it work is to pass an array declared and initialized in one line. What did I miss? Can I make it work with pointer syntax and without any complex function (malloc...)?

Upvotes: 0

Views: 142

Answers (1)

Kaz

Reputation: 58647

The correct version of your call is ft_strlowcase(str1). However, that still perpetrates undefined behavior because the string literal object created by str1 = "AbCdEfGhI" is being modified. String literals are part of the program image; modifying a string literal is de facto self-modifying code, whose behavior is not defined by ISO C.

There is not actually a big difference between dynamic languages and C here; ANSI Common Lisp treats the modification of literal objects the same way.

By the way, your C compiler should be warning you that the ft_strlowcase(&str1) call is ill-typed: it passes a char ** pointer to a function which expects char *.

Now regarding this:

//char  str1[] = "AbCdEfGhI"; //WORKING

Yes that will work for two (and a half) reasons.

  1. Firstly, "AbCdEfGhI" is no longer a literal object here, but only initializer syntax. The object is the str1 array, and "AbCdEfGhI" specifies its size and initial contents. This str1 array is not a string literal; it is mutable. It is well defined to do something like str1[0]++.

  2. Because str1 is an array, the expression &str1 produces a "pointer to array" value. But this pointer points to the same address as the first character of that array. That is to say, because str1 is an array, str1, &str1[0] and &str1 all point to the same address. The first two have the type char *, whereas &str1 has the type char (*)[10]: pointer to an array of 10 char. The expression ft_strlowcase(&str1) still requires a diagnostic, which makes your program undefined: you are asking the compiler to convert one pointer type to an incompatible type without a cast. However, if the compiler simply emits the diagnostic and then supplies the conversion as if there were a cast, then you will get the apparently correct behavior. You need ft_strlowcase(str1) (vastly preferred) or else ft_strlowcase((char *) &str1) (provide the cast, so no diagnostic is required). Programs that require a diagnostic have undefined behavior if they are translated and executed anyway!

Lastly, your ft_strlowcase function is verbose. More idiomatic C code looks like this, among other possibilities:

char *ft_strlowcase(char *str)
{
    for (char *ptr = str; *ptr; ptr++) {
        if (*ptr >= 'A' && *ptr <= 'Z')
            *ptr += 32;
    }
    return str;
}

Believe it or not, this is more readable to experienced C programmers, because it condenses several well-understood, well-worn idioms into a concise clump.

In your original code, this is particularly something that should be avoided:

int i;
i = 0;

For no reason at all, you've declined an obvious opportunity to define an initialized variable:

int i = 0;

Initialization should always be preferred to assignment. They are different. Initialization means that an object is "born" into the world with a value; at no point in its program-visible existence does it not have a value. In C, we can define a local variable object without initializing it, which leaves it "indeterminately-valued". That habit creates the risk of using an indeterminately-valued object, which is undefined behavior.

Upvotes: 5
