Eyad Ayman

Reputation: 11

Why do we need a null terminator only in strings in C?

I'm taking CS50X and am on week 2 now. My question is: why do we need a null character '\0' in strings (i.e. null-terminated char arrays) to mark the end, when we don't need it in ordinary char arrays or in arrays of other types, such as an int array? For example, when printing both arrays (the null-terminated char array and the int array), what marks the end of the second array?

I tried to demonstrate how strings are implemented for myself with some code:

This code worked, printing "hi!" in the terminal.

This also worked, printing the three scores.

Why did we need an additional place in the array for the null character in the first code? Couldn't we have used i < 3 instead, as we did in the second code? A character array, like any other array, has a specific length, so what changed when we decided to treat a string as a character array?

Upvotes: 0

Views: 981

Answers (3)

4386427

Reputation: 44274

Short answer: To be able to store short strings in a bigger array.

Explanation:

Assume you have (one way or another) allocated a memory area capable of holding M characters and you want to store a string into that memory.

If the string has exactly M characters you can print it like:

for (i = 0; i < M; ++i) putchar(str[i]);

In principle that's no problem. You know the value M from the size of the memory area (note: this is only true in some cases, but for now let's assume that).

But what if you want to store and later print a string with N (N < M) characters in that memory?

When printing it, you could of course do:

for (i = 0; i < N; ++i) putchar(str[i]);

But from where do you get the value N?

Sometimes N is 5 (e.g. the string "Hello"), sometimes N is 13 (e.g. the string "stackoverflow"), and so on.

One solution would be to keep N in a separate variable that you update whenever you change the string.

Another solution would be to use a sentinel value to indicate "End of string" and store that special value as part of the string.

There are pros and cons in both solutions.

The designers of C decided to go with the second solution. Consequently, we must always make sure to include the sentinel (the NUL) when dealing with strings in C.

The print can now be written:

for (i = 0; str[i] != '\0'; ++i) putchar(str[i]);

and it will work no matter what length the string has.
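As a rough sketch of the idea above, the following stores strings of different lengths in the same memory area and prints each one up to the sentinel. The capacity of 16 and the helper names store and print_until_nul are assumptions made for illustration, not part of any standard API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define M 16 /* assumed capacity of the memory area, in chars */

/* Copies src into buf (capacity M), including the terminating '\0'.
   Returns 0 on success, -1 if src plus the sentinel does not fit. */
int store(char buf[M], const char *src)
{
    size_t n = strlen(src);
    if (n + 1 > M)
        return -1;           /* no room for n characters plus the '\0' */
    memcpy(buf, src, n + 1); /* copy the characters and the sentinel */
    return 0;
}

/* Prints str up to (not including) the sentinel, then a newline.
   Returns the number of characters printed before the newline. */
size_t print_until_nul(const char *str)
{
    size_t i;
    assert(str != NULL);
    for (i = 0; str[i] != '\0'; ++i)
        putchar(str[i]);
    putchar('\n');
    return i;
}
```

Storing "Hello" and then "stackoverflow" in the same char buf[M] and calling print_until_nul(buf) after each prints 5 and then 13 characters, without N being kept anywhere except implicitly in the position of the '\0'.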

BTW:

Interesting read: https://stackoverflow.com/a/1258577/4386427

Upvotes: 1

Gene

Reputation: 46960

The truth is that you don't need null terminators. They're just the convention that the C library chose to represent the end of the string.

For some purposes, it's a terrible choice. An example: when strings might contain nulls. Another: when string length must be computed often; the only way is to traverse the whole (potentially very long) string.

A method without these problems would be to represent a string as a char array (not null terminated) and an explicit length paired with it:

typedef struct string_s {
  char *text;
  size_t len;
} STRING;

And in fact you'll find systems written in C that do this.
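A minimal sketch of how such a struct might be used, assuming hypothetical helpers string_from, string_write and c_view_len (none of these are standard functions). It shows that an explicit length lets the text contain '\0' bytes, while anything that scans for a terminator stops early.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct string_s {
  char *text;
  size_t len;
} STRING;

/* Hypothetical helper: wraps an existing buffer without copying it. */
STRING string_from(char *text, size_t len)
{
    STRING s = { text, len };
    return s;
}

/* Writes exactly s.len bytes to out, embedded '\0' bytes included.
   Returns the number of bytes written. */
size_t string_write(STRING s, FILE *out)
{
    if (s.len == 0)
        return 0;
    assert(s.text != NULL);
    return fwrite(s.text, 1, s.len, out);
}

/* Length as a terminator-scanning function would see it:
   the offset of the first '\0', or s.len if there is none. */
size_t c_view_len(STRING s)
{
    const char *p = s.len ? memchr(s.text, '\0', s.len) : NULL;
    return p ? (size_t)(p - s.text) : s.len;
}
```

For a five-byte buffer holding 'a', 'b', '\0', 'c', 'd', s.len is 5 and is available without any traversal, whereas c_view_len reports 2, which is what a null-terminated interpretation of the same bytes would see.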

The downside is that they can't use the standard library functions for concatenation, I/O, etc.; they need to supply their own. Also, size_t takes up to 8 bytes on common platforms, while a terminating null is only one. When C was invented, that difference was a fairly big deal. In some applications (like very small embedded processors), it still is.

Upvotes: 4

chux

Reputation: 153456

Why do we need a null terminator

To indicate the length.


When using functions on 1) a string or 2) an array, the function cannot receive the string or the array. It can receive a pointer to the string or the array. It will be a pointer to the first character of the string or array.

Now how does the function know how long the string or array is?


With strings, the function knows the length by inspecting the data and when it detects a null character, it knows that is the end of the string. No additional parameter was needed to be sent to the function.

foo_string(pointer_to_string_beginning);

With arrays, the caller needs to send the element count of the array to the function in addition to the pointer (in whatever order the function prescribes). The function cannot use the data of the array to find the end, as no value is reserved to indicate the "end".

foo_array(element_count_of_the_array, pointer_to_array_beginning);

If sending 2 parameters is OK, use arrays and size. Else for text, use 1 parameter for a string.
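To make the two call shapes concrete, here is a small sketch. The function bodies are assumptions chosen for illustration (counting characters and summing elements); only the names and parameter order come from the calls shown above.

```c
#include <assert.h>
#include <stddef.h>

/* One parameter: the end is found by inspecting the data for '\0'.
   Returns the number of characters before the terminator. */
size_t foo_string(const char *pointer_to_string_beginning)
{
    size_t n = 0;
    assert(pointer_to_string_beginning != NULL);
    while (pointer_to_string_beginning[n] != '\0')
        n++;
    return n;
}

/* Two parameters: every int value, 0 included, is valid data,
   so the caller must pass the element count. Returns the sum. */
long foo_array(size_t element_count_of_the_array,
               const int *pointer_to_array_beginning)
{
    long sum = 0;
    for (size_t i = 0; i < element_count_of_the_array; i++)
        sum += pointer_to_array_beginning[i];
    return sum;
}
```

With int scores[] = { 72, 0, 33 }, the call foo_array(sizeof scores / sizeof scores[0], scores) visits all three elements. A loop that stopped at the first 0 would lose the last score, which is why an int array cannot borrow the string convention.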

For text, strings are the common approach used since the 1970s.


How to return values that indicate a string or array is the next concern, not yet addressed here.

Upvotes: 3
