Hodaya Shalom

Reputation: 4417

String behavior in C

I want to understand a number of things about strings in C:

I don't understand why you cannot change a string with a normal assignment (but only through the functions of string.h). For example, I can't do d="aa" (where d is a pointer to char or an array of char). Can someone explain what is going on behind the scenes? The compiler lets such code run, and then you get a segmentation fault.

Something else: I ran a C program that contains the following lines:

char c='a',*pc=&c;
printf("Enter a string:");
scanf("%s",pc);
printf("your first char is: %c",c);
printf("your string is: %s",pc);
  1. If I enter more than 2 letters (in scanf), I get a segmentation fault. Why is this happening?
  2. If I enter two letters, the first letter is printed correctly, but the string is printed with a lot of extra (incorrect) characters.
  3. If I enter one letter, the letter is printed correctly, but the string is printed with a lot of extra characters and something weird at the end (a square with four numbers containing zeros and ones).

Can anyone explain what is happening behind the scenes?

Please note: I do not want the program to work, I did not ask the question to get suggestions for another program, I just want to understand what happens behind the scenes in these situations.

Upvotes: 0

Views: 118

Answers (3)

Strings almost do not exist in C (except as C string literals like "abc" in some C source file).

In fact, strings are mostly a convention: a C string is an array of char whose last element is the zero char '\0'.

So declaring

 const char s[] = "abc";

is exactly the same as

 const char s[] = {'a','b','c','\0'};

in particular, sizeof(s) is 4 (3+1) in both cases (and so is sizeof("abc")).
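As a quick check of the claim above, here is a minimal sketch (the array and function names are made up for illustration) that compares the two initializers and contrasts sizeof with strlen:

```c
#include <stddef.h>
#include <string.h>

/* Both initializers produce the same 4-byte array: 'a', 'b', 'c', '\0'. */
static const char from_literal[] = "abc";
static const char from_list[]    = {'a', 'b', 'c', '\0'};

/* sizeof counts every byte of the array, terminator included. */
size_t literal_array_size(void) { return sizeof from_literal; }
size_t list_array_size(void)    { return sizeof from_list; }

/* strlen counts the chars before the first '\0'. */
size_t literal_length(void) { return strlen(from_literal); }

/* Byte-for-byte comparison of the two arrays. */
int arrays_are_identical(void)
{
    return memcmp(from_literal, from_list, sizeof from_literal) == 0;
}
```

literal_array_size() and list_array_size() both return 4, while literal_length() returns 3, because strlen stops at the terminator and does not count it.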

The standard C library contains a lot of functions (such as strlen(3) or strncpy(3)...) which obey and/or presuppose the convention that strings are zero-terminated arrays of char-s.
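This convention is also why copying a string is done by a library function rather than by =. An array cannot be assigned to at all, and assigning to a char pointer only changes where the pointer points, so the characters have to be copied into memory you own. A minimal sketch, assuming two hypothetical helpers (copy_into and repoint_to_literal are not library functions):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into a caller-owned writable array.
   Returns 0 on success, -1 if src (plus its '\0') does not fit. */
int copy_into(char *dst, size_t dst_size, const char *src)
{
    size_t needed = strlen(src) + 1;   /* chars plus the terminator */
    if (needed > dst_size)
        return -1;                     /* refusing avoids a buffer overflow */
    memcpy(dst, src, needed);
    return 0;
}

/* Hypothetical helper: assigning to a pointer copies no characters,
   it only makes the pointer refer to a different literal. */
const char *repoint_to_literal(void)
{
    const char *d = "xy";
    d = "aa";          /* legal: d now points at another literal */
    return d;          /* the literal itself must never be modified */
}
```

With char d[3] = "xy"; a call copy_into(d, sizeof d, "aa") leaves "aa" in d, whereas copy_into(d, sizeof d, "aaa") is refused because four bytes are needed. Writing through a pointer that points at a literal, or at no valid memory at all, is undefined behavior and a common source of the segmentation fault mentioned in the question.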

Better code would be:

char buf[16]="a",*pc= buf;
printf("Enter a string:"); fflush(NULL);
scanf("%15s",pc);
printf("your first char is: %c",buf[0]);
printf("your string is: %s",pc);

Some comments: be afraid of buffer overflow. When reading a string, always give a bound to the read string, or else use a function like getline(3) which dynamically allocates the string in the heap. Beware of memory leaks (use a tool like valgrind ...)
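To illustrate the bound, here is a small sketch that uses sscanf on an in-memory string so it can be tried without typing input. parse_word16 is a hypothetical helper; scanf and fscanf accept the same %15s width:

```c
#include <stdio.h>

/* Hypothetical helper: extract the first whitespace-delimited word of
   `input` into buf. The width 15 leaves room for the '\0' that %s always
   appends, so a 16-byte buffer cannot overflow.
   Returns 1 if a word was stored, 0 otherwise. */
int parse_word16(const char *input, char buf[16])
{
    return sscanf(input, "%15s", buf) == 1;
}
```

A 26-letter input is cut after 15 characters; the rest is simply not stored instead of being written past the end of buf.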

When computing a string, also be aware of its maximum size. See snprintf(3) (avoid sprintf).
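As a rough sketch of how snprintf reports truncation (format_pair is a hypothetical helper, not a library function): the return value is the length the full text would have had, so comparing it with the buffer size tells you whether everything fit.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format "name=value" into buf without overflowing.
   Returns 1 if the whole text fit, 0 if it was truncated or on error. */
int format_pair(char *buf, size_t size, const char *name, int value)
{
    /* snprintf writes at most size-1 chars plus '\0' and returns the
       number of chars the complete result needs (excluding '\0'). */
    int n = snprintf(buf, size, "%s=%d", name, value);
    return n >= 0 && (size_t)n < size;
}
```

With an 8-byte buffer, format_pair(buf, sizeof buf, "x", 42) stores "x=42" and reports success, while a longer name is truncated to 7 characters plus the terminator and reported as not fitting.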

Often, you adopt the convention that a string is returned and dynamically allocated in the heap. You may want to use strdup(3) or asprintf(3) if your system provides it. But you should adopt the convention that the calling function (or something else, but well defined in your head) is free(3)-ing the string.
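One way to follow that convention without relying on strdup or asprintf, which not every system provides, is plain malloc plus snprintf. A minimal sketch (make_greeting is a hypothetical function used only for illustration):

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical function: returns a heap-allocated "hello, NAME" string,
   or NULL on failure. The caller owns the result and must free() it. */
char *make_greeting(const char *name)
{
    /* With a NULL buffer and size 0, snprintf only computes the length. */
    int n = snprintf(NULL, 0, "hello, %s", name);
    if (n < 0)
        return NULL;

    char *s = malloc((size_t)n + 1);     /* +1 for the terminating '\0' */
    if (s == NULL)
        return NULL;

    snprintf(s, (size_t)n + 1, "hello, %s", name);
    return s;                            /* ownership passes to the caller */
}
```

The caller writes char *g = make_greeting("C"); uses g, and then calls free(g); exactly once.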

Your program can be semantically wrong and still, by bad luck, sometimes happen to work. Read carefully about undefined behavior (UB) and avoid it absolutely: your points 1, 2 and 3 are all UB, because scanf stores the terminating '\0' past the single char even for a one-letter input. Sadly, UB may sometimes appear to "work".

To explain some actual undefined behavior, you have to take into account your particular implementation: the compiler, the flags (notably optimization flags) passed to the compiler, the operating system, the kernel, the processor, the phase of the moon, etc. Undefined behavior is often not reproducible (e.g. because of ASLR); read about heisenbugs. To explain the behavior of points 1, 2 and 3 you need to dive into implementation details; look into the assembler code (gcc -S -fverbose-asm) produced by the compiler.

I suggest you compile your code with all warnings and debugging info (e.g. using gcc -Wall -g with GCC), improve the code until you get no warnings, and learn how to use the debugger (e.g. gdb) to run your code step by step.

Upvotes: 2

Suji

Reputation: 1326

char c="a"; is a wrong declaration in C: anything enclosed in a pair of double quotes (""), even a single character, is treated as a string, so "a" is really 'a' followed by '\0', because every string ends with a '\0' null character. char c="a"; is wrong, whereas char c='c'; is correct.

Also note that a char occupies only 1 byte of memory, so it can hold only one character. Memory allocation details for the data types are described below:

 Operator Types
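The size difference can be checked directly with sizeof. A minimal sketch with made-up function names:

```c
#include <stddef.h>

/* A char object occupies exactly one byte by definition. */
size_t size_of_char_object(void)
{
    char c = 'c';
    return sizeof c;
}

/* The literal "a" is an array of two chars: 'a' and the terminating '\0'. */
size_t size_of_string_a(void)
{
    return sizeof "a";
}
```

So "a" needs two bytes and cannot be stored in a single char. With char c="a"; the compiler instead tries to convert the address of the literal to a char, which is why it complains.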

Upvotes: 0

Jeyaram

Reputation: 9484

If I enter more than 2 letters (in scanf), I get a segmentation fault. Why is this happening?

Because memory is allocated for only one byte: c is a single char. Even a one-letter string such as "a" consists of 'a' followed by '\0', which is two bytes and does not fit in a one-byte memory location.

If scanf() writes more than one byte into this memory, the behavior is simply undefined.
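To put numbers on this: %s stores every character of the word plus a terminating '\0', so even a one-letter input needs two bytes. A small sketch (both helper names are hypothetical):

```c
#include <stddef.h>
#include <string.h>

/* Bytes that scanf("%s") stores for a given word: its chars plus '\0'. */
size_t bytes_stored_for(const char *word)
{
    return strlen(word) + 1;
}

/* Would those bytes fit in an object of the given size? */
int fits(const char *word, size_t object_size)
{
    return bytes_stored_for(word) <= object_size;
}
```

fits("a", sizeof(char)) is false, so every non-empty input overruns the single char c. What happens after the overrun (garbage output, a crash, or apparently nothing) depends on what happens to sit next to c in memory.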

Upvotes: 1
