Reputation: 4628
I'm writing a C program to find the longest line in the user's input and print the line's length and the line itself. It succeeds at counting the characters but unpredictably fails at storing the line itself. Maybe I'm misunderstanding C's memory management and someone can correct me.
EDIT: follow-up question: I understand now that the blocks following the dummy char are unallocated and thus open range for the computer to do anything with them, but then why does the storage of some chars still work? In the second example I mention, the program stores characters in the 'unallocated' blocks even though it 'shouldn't'. Why?
Variables:
c holds the return value of getchar() each time I call it
i is the length (so far) of the current line I'm reading with getchar()
longest_i is the length of the longest line so far
twostr points to the beginning of the first of two strings: the first for the current line, the second for the longest line so far. When a line is discovered to be the longest, it is copied into the second string. If a future line is even longer, it overwrites some of the second string, but that's OK because I won't use it anymore -- the second string will now begin at a location farther to the right.
dummy gives twostr a place to point to
This is how I visualize the memory used by the program's variables:
11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|\n| 7|11|15|c |u |r |r |e |n |t |\0|e |s |t |\0|p |r |e |v |l |o |n |g |e |s |t |\0|
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
true statements:
&c == 11
&i == 12
&longest_i == 13
&twostr == 14
&dummy == 15
program:
#include <stdio.h>
int main()
{
    char c = '\0';
    int i, longest_i;
    char *twostr;
    longest_i = i = 0;
    char dummy = '\0';
    twostr = &dummy;
    while ((c=getchar()) != EOF)
    {
        if (c != '\n')
        {
            *(twostr+i) = c;
            i++;
        }
        else
        {
            *(twostr+i) = '\0';
            if (i > longest_i)
            {
                longest_i = i;
                for (i=0; (c=*(twostr+i)) != '\0'; ++i)
                    *(twostr+longest_i+1+i) = c;
            }
            i = 0;
        }
    }
    printf("length is %d\n", longest_i);
    for (i=0; (c=*(twostr+longest_i+1+i)) != '\0'; ++i)
        putchar(c);
    return 0;
}
The output from *(twostr+longest_i+1) until '\0' is unpredictable. Examples:
input:
longer line
line
output:
length is 11
@
input:
this is a line
this is a longer line
shorter line
output:
length is 21
this is a longer lineÔÿ"
Upvotes: 2
Views: 438
Reputation: 8358
First, you will need to make sure that twostr has sufficient space to hold the strings that you're managing. You will likely need to add some logic to allocate the initial space as well as to allocate additional space when needed. Something like:
size_t twostrLen = 256;
char* twostr = malloc(twostrLen);
Then inserting data into this, you'll need to make sure you allocate additional memory if your index will exceed the current length of twostrLen:
if (i >= twostrLen) {
    char* tmp = twostr;
    twostrLen *= 2;
    twostr = malloc(twostrLen);
    memcpy(twostr, tmp, i); /* copy all i bytes written so far (needs <string.h>) */
    free(tmp);
}
Where i is the offset from twostr that you're about to write to.
Finally, when copying from the current string to the longest string, your loop termination condition is (c=*(twostr+i)) != '\0'. The loop exits as soon as c is '\0', before the terminating null is written. You'll need to make sure the null is written so that your printing loop works correctly. Adding the following after your inner-most for loop should address the issue:
*(twostr+longest_i+1+i) = 0;
Without this, our last loop will continue to read until a null character is encountered. This could be immediately (as seen in your first example where it appears to work), or could be some number of bytes later (like your second example, where additional characters are printed).
Again, remember to check that longest_i+1+i < twostrLen before writing to that location.
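As a rough sketch of the grow-when-needed idea, the check and the reallocation can be wrapped in one helper. The names line_buf and line_buf_push are made up for this illustration and are not part of the question's code; the sketch also uses realloc instead of malloc plus memcpy, which is an assumption about what you would prefer.

```c
#include <stdlib.h>

/* Hypothetical type for illustration: a character buffer that grows on demand. */
struct line_buf {
    char  *data;  /* heap storage, or NULL before the first append */
    size_t len;   /* characters stored, not counting the terminator */
    size_t cap;   /* bytes currently allocated */
};

/* Append one character and keep data null-terminated.
   Returns 0 on success, -1 if memory could not be allocated. */
int line_buf_push(struct line_buf *b, char ch)
{
    if (b->len + 2 > b->cap) {                 /* need room for ch and '\0' */
        size_t new_cap = b->cap ? b->cap * 2 : 16;
        char *tmp = realloc(b->data, new_cap); /* keeps the old contents */
        if (tmp == NULL)
            return -1;                         /* old buffer is still valid */
        b->data = tmp;
        b->cap = new_cap;
    }
    b->data[b->len++] = ch;
    b->data[b->len] = '\0';
    return 0;
}
```

Start from struct line_buf b = {NULL, 0, 0}; and call free(b.data) when you are done. Because the terminator is written on every append, the string is always safe to print.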
Upvotes: 1
Reputation: 1177
Try the following code. Hope you will get your expected result:
#include <stdio.h>
#define LENGTH 1024
int main()
{
    int c; // int, not char, so that EOF can be told apart from real characters
    int i, longest_i;
    char twostr[LENGTH]=""; // twostr is an array of 1024 bytes
    char longest[LENGTH]=""; // so is longest, where we will store the longest string
    longest_i = i = 0;
    while ((c=getchar()) != EOF && i < LENGTH) // we check that i < 1024 so we don't
                                               // go outside the bounds of our arrays
    {
        if (c != '\n')
        {
            *(twostr+i) = c;
            i++;
        }
        else
        {
            twostr[i] = 0;
            if (i > longest_i)
            {
                longest_i = i;
                for (i = 0; twostr[i] != 0; ++i) { // 0 is the same as '\0'
                    longest[i] = twostr[i];
                    twostr[i] = 0; // clear twostr as we go
                }
                longest[i] = 0; // terminate the copy
            }
            i = 0;
        }
    }
    printf("length is: %d\n", longest_i);
    printf("And the word is: ");
    puts(longest);
    printf("\n");
    return 0;
}
Upvotes: 1
Reputation: 75130
Yes, you are correct in saying that you are misunderstanding C's memory management model.
In the line
*(twostr+i) = c;
for example, this would be right except for the fact that twostr contains the address of a single character, so only *twostr refers to memory that you own. Adding anything except 0 to it to get another address and dereferencing that produces undefined behaviour, because the memory that belongs to dummy is only 1 byte in size.
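To make the rule about valid offsets concrete, here is a minimal sketch. The helper store_checked is a hypothetical name, not something from the question; it only writes when the index falls inside the object the pointer actually refers to.

```c
#include <stddef.h>

/* Hypothetical helper: write ch to buf[i] only when i lies inside the
   size bytes that buf really refers to.
   Returns 1 if the write happened, 0 if it was rejected. */
int store_checked(char *buf, size_t size, size_t i, char ch)
{
    if (i >= size)
        return 0;   /* outside the object: writing here would be undefined */
    buf[i] = ch;
    return 1;
}
```

With char dummy, the call store_checked(&dummy, sizeof dummy, 0, 'x') succeeds, while index 1 is rejected. With char line[1024], indices 0 through 1023 are accepted.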
So to make a long story short, you need to allocate a chunk of memory to store the string in. It's easiest just to show you how to do it right, so here is the code with corrections made:
#include <stdio.h>
int main()
{
    int c; // int, not char, so that EOF can be told apart from real characters
    int i, longest_i;
    char twostr[1024]; // twostr is a block of memory 1024 bytes long
    char longest[1024]; // so is longest, where we will store the longest string
    longest_i = i = 0;
    longest[0] = 0; // start with an empty longest string
    while ((c=getchar()) != EOF && i < 1024) // we check that i < 1024 so we don't
                                             // go outside the bounds of our arrays
    {
        if (c != '\n')
        {
            *(twostr+i) = c;
            i++;
        }
        else
        {
            twostr[i] = 0;
            if (i > longest_i)
            {
                longest_i = i;
                for (i = 0; twostr[i] != 0; ++i) { // 0 is the same as '\0'
                    longest[i] = twostr[i];
                    twostr[i] = 0; // clear twostr as we go
                }
                longest[i] = 0; // terminate the copy; longest is not zero-initialised
            }
            i = 0;
        }
    }
    printf("length is %d\n", longest_i);
    for (i=0; longest[i] != 0; ++i)
        putchar(longest[i]);
    return 0;
}
Furthermore, the way you visualise your program's variables is incorrect. It would really be something like this (the exact sizes and ordering depend on the compiler and platform):
Stack:
+---------+
|         |
|         |
|         |
|    c    | 4 bytes
+---------+
|         |
|         |
|         |
|    i    | 4 bytes
+---------+
|         |
|         |
|         |
|longest_i| 4 bytes
+---------+
|         |
|         |
|         |
~~~~~~~~~~~
|         |
|         |
| twostr  | 1024 bytes
+---------+
|         |
|         |
|         |
~~~~~~~~~~~
|         |
|         |
| longest | 1024 bytes
+---------+
Upvotes: 2
Reputation: 455142
You are not allocating memory to store the characters read by getchar. Your pointer twostr is a character pointer pointing to a single character variable, not an array, but you are treating it as a pointer to a char array:
char *twostr;
....
char dummy = '\0';
twostr = &dummy;
....
*(twostr+i) = c; // when i here is > 0 you are accessing invalid memory.
What you need is something like:
char *twostr = malloc(MAX);
// use it.
free(twostr);
Where MAX is defined to be one more than the max length of the string in user input.
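A small sketch of the "one more than the length" rule, under the assumption that the line is already known. The helper name copy_to_heap is invented for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap copy of line, or NULL on failure.
   The caller must free() the result. */
char *copy_to_heap(const char *line)
{
    size_t max = strlen(line) + 1;   /* one more than the length, for '\0' */
    char *copy = malloc(max);
    if (copy != NULL)
        memcpy(copy, line, max);     /* copies the terminator as well */
    return copy;
}
```

For the 21-character line in the question's second example, this allocates 22 bytes.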
Upvotes: 2
Reputation: 36082
twostr points to a single character, but you are treating it as a buffer.
What you need to do is make a buffer that can hold more characters,
e.g.
static char dummy[512];
twostr = dummy;
Upvotes: 1
Reputation: 1462
You're smashing your stack. You only have 1 byte allocated for char dummy. Really it should be something like:
char dummy[1024];
You also need to make sure you write no more than 1023 characters, so that there is room for the null terminator.
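One way to enforce that limit, sketched under the assumption that the line comes from an existing string. The helper copy_line is a made-up name and is not from the question.

```c
#include <stddef.h>

/* Hypothetical helper: copy src up to (not including) the first newline
   into dst, writing at most dst_size - 1 characters plus a '\0'.
   Returns the number of characters copied. */
size_t copy_line(char *dst, size_t dst_size, const char *src)
{
    size_t n = 0;
    if (dst_size == 0)
        return 0;                    /* no room even for the terminator */
    while (src[n] != '\0' && src[n] != '\n' && n + 1 < dst_size) {
        dst[n] = src[n];
        n++;
    }
    dst[n] = '\0';                   /* always terminated, always in bounds */
    return n;
}
```

With char dummy[1024], passing sizeof dummy as dst_size caps the copy at 1023 characters.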
Upvotes: 2
Reputation: 8255
You're not actually allocating any memory to write into!
char dummy = '\0'; // creates a char variable and puts \0 into it
twostr = &dummy; // sets twostr to point to the address of dummy
After this, you're simply writing into the memory which comes after the char set aside by dummy, and writing over who-knows-what.
The easiest fix in this case would be to make dummy a pointer to a char, and then malloc a buffer to use for your strings (make it longer than the longest string you expect!)
For instance, buffer below would point to 256 bytes of memory (sizeof(char) is always 1), allowing for a string up to 255 characters long (as you have the null terminator (\0) to store at the end).
char * buffer = (char *)malloc(sizeof(char) * 256);
Edit: This would allocate memory from the heap, which you should later free up by calling free(buffer); when you're done with it. The alternative is to use up space on the stack as per Anders K's solution.
Upvotes: 4