Reputation: 878
I've written a function that takes a string and returns a const char * which contains an encoded version of that string. I call this function, and then create a new string. In doing so, I am somehow inadvertently changing the value pointed to by my const char *, something which I thought was impossible.
However, when I don't use my own function, but just hardcode a value into my const char array, the value does not change when I create a string. Why is there a difference here, and why would I be able to change the value of a const char array anyway?
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <iostream>
#include <string>
using namespace std;
// returns "@username@FIN"
const char* encodeUsername(string username)
{
    username = "@" + username + "@FIN";
    return username.c_str();
}
int main(void)
{
    string jack("jack");
    const char* encodedUsername = "@jack@FIN";
    string dummy("hi");
    printf("%s\n", encodedUsername); //outputs "@jack@FIN", as expected.
    string tim("tim");
    const char* encodedUsername2 = encodeUsername(tim);
    string dummy2("hi");
    printf("%s\n", encodedUsername2); //outputs "hi". Why?
}
Upvotes: 4
Views: 144
Reputation: 12795
To understand why this happens you need to understand several intrinsic properties of C++.
char* moo()
{
    char* a = new char[20];
    strcpy(a, "hello");
    delete[] a;
    return a;
}
Note that even though I just deleted a, I can still return a pointer to it. The caller receives that pointer and has no idea that it points to freed memory. Moreover, if you immediately print the string it points to, you will very likely see "hello" (even though reading it is undefined behaviour), because delete usually does not zero out the memory it frees.
std::string is, roughly speaking, a wrapper around a char* that hides all the allocations and deallocations behind a very nice interface, so that you don't need to care about memory management. Its constructors and any operations that change its contents allocate or reallocate the array as needed, and the destructor deallocates it.
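To make the "wrapper around char*" idea concrete, here is a minimal sketch. TinyString is a hypothetical, heavily simplified class invented for illustration; it is not how any real std::string is implemented, it only shows the constructor allocating and the destructor deallocating.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical, heavily simplified stand-in for std::string (illustration only).
class TinyString {
public:
    // The constructor allocates a buffer and copies the text into it.
    explicit TinyString(const char* text)
        : size_(std::strlen(text)), data_(new char[size_ + 1]) {
        std::memcpy(data_, text, size_ + 1);  // copy characters and the '\0'
    }
    // Copying allocates a separate buffer, so each object owns its own memory.
    TinyString(const TinyString& other)
        : size_(other.size_), data_(new char[other.size_ + 1]) {
        std::memcpy(data_, other.data_, size_ + 1);
    }
    TinyString& operator=(const TinyString&) = delete;  // kept minimal on purpose

    // The destructor frees the buffer; pointers obtained earlier from
    // c_str() dangle from this point on.
    ~TinyString() { delete[] data_; }

    const char* c_str() const { return data_; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;  // declared first, so it is initialized before data_
    char* data_;
};
```

The point to take away is that c_str() just hands out the internal pointer, so it can never outlive the object that owns the buffer.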
When you pass something into a function by value (as you do with the username parameter in the signature const char* encodeUsername(string username)), the function gets a new object holding a copy of what you passed, and that copy is destroyed when the function ends. So in this case, as soon as encodeUsername returns, username is destroyed, because it was passed by value and lives only within the function's scope. When the object is destroyed, its destructor is called, and at that point the string's memory is deallocated. The pointer to the raw data that you retrieved by calling c_str() now points to something that no longer exists.
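You can watch a by-value parameter being destroyed without touching any dangling pointer. The sketch below uses a hypothetical Tracer type (not part of the question's code) that records its copy construction and destruction in a log; takesByValue and runLifetimeDemo are likewise names made up for this example.

```cpp
#include <string>
#include <vector>

// Hypothetical helper type for illustration: records lifetime events in a log.
struct Tracer {
    std::vector<std::string>* log;  // where lifetime events are recorded
    explicit Tracer(std::vector<std::string>* l) : log(l) {}
    Tracer(const Tracer& other) : log(other.log) { log->push_back("copy constructed"); }
    ~Tracer() { log->push_back("destroyed"); }
};

// Takes its argument by value, just like encodeUsername(string username).
void takesByValue(Tracer t) {
    t.log->push_back("inside function");
}

// Runs the experiment and returns the sequence of recorded events.
std::vector<std::string> runLifetimeDemo() {
    std::vector<std::string> log;
    {
        Tracer original(&log);
        takesByValue(original);       // copies `original` into the parameter
        log.push_back("after call");  // the parameter copy is already gone here
    }                                 // `original` itself is destroyed here
    return log;
}
```

The log comes back as: copy constructed, inside function, destroyed, after call, destroyed. The first "destroyed" belongs to the parameter copy and happens before the statement after the call runs, which is exactly the moment the pointer from c_str() goes stale in encodeUsername.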
When you allocate an object immediately after a deallocation, you are very likely to reuse the memory that was just freed. In your case, when you create the new string dummy2, it allocates memory at the same address that was just deallocated when encodeUsername returned, which is why printing encodedUsername2 shows "hi".
Now, how can you fix it?
First, if you don't care about the input string (that is, if you are OK with overwriting it), you can just pass it by reference:
const char* encodeUsername(string& username)
This will fix it, because username is not a copy, so it is not destroyed at the end of the function. The problem now, however, is that the function changes the string you pass in, which is very undesirable and makes for an unintuitive interface. The returned pointer also stays valid only while the caller's string exists and is not modified.
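As a rough sketch of what the by-reference version looks like in full (same body as in the question, only the parameter type changed):

```cpp
#include <string>

// Same body as in the question, but the parameter is a reference to the
// caller's string, so no temporary copy is created or destroyed.
const char* encodeUsername(std::string& username) {
    username = "@" + username + "@FIN";  // side effect: modifies the caller's string
    return username.c_str();             // points into the caller's string
}
```

If you call it with a string tim holding "tim", the returned pointer reads "@tim@FIN", but tim itself now also holds "@tim@FIN", and the pointer is only good until tim is modified or destroyed.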
Second, you can copy the string into a newly allocated char array, return that, and free it in the calling function once you are done with it:
char* encodeUsername(string username)
{
    username = "@" + username + "@FIN";
    return strdup(username.c_str());
}
and then in main, store the result in a char* and free it at the end:
char* encodedUsername2 = encodeUsername(tim);
printf("%s\n", encodedUsername2);
free(encodedUsername2);
(note that you have to use free and not delete[], because the array was allocated by strdup, and that you must not free encodedUsername, which points to a string literal rather than to heap memory)
This works because the char array we return is allocated on the heap right before we return and is not freed by the function. The price is that the calling function now needs to free it, which is, again, an unintuitive interface.
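If you really do need a heap-allocated C string, one way to take the "remember to call free" burden off the caller is to wrap the pointer in a std::unique_ptr with a deleter that calls free. This is a variation I am adding, not something from the question; FreeDeleter, CStringPtr and encodeUsernameOwned are hypothetical names, and std::malloc plus std::memcpy stand in for strdup here because strdup comes from POSIX rather than from standard C++.

```cpp
#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>

// Deleter that releases memory obtained from malloc.
struct FreeDeleter {
    void operator()(char* p) const { std::free(p); }
};

// Hypothetical alias: an owning pointer to a malloc-allocated C string.
using CStringPtr = std::unique_ptr<char, FreeDeleter>;

CStringPtr encodeUsernameOwned(const std::string& username) {
    const std::string encoded = "@" + username + "@FIN";
    // Copy the characters (plus the terminating '\0') into a fresh heap buffer.
    char* buffer = static_cast<char*>(std::malloc(encoded.size() + 1));
    if (buffer != nullptr) {
        std::memcpy(buffer, encoded.c_str(), encoded.size() + 1);
    }
    return CStringPtr(buffer);  // holds nullptr if the allocation failed
}
```

The caller reads the characters through get() and the buffer is freed automatically when the CStringPtr goes out of scope.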
Finally, the proper solution is to return a std::string instead of a char pointer, in which case the std::string takes care of all the allocations and deallocations for you:
string encodeUsername(string username)
{
    username = "@" + username + "@FIN";
    return username;
}
And then in the main function:
string encodedUsername2 = encodeUsername(tim);
printf("%s\n", encodedUsername2.c_str());
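For completeness, here is a small variant of the same fix that takes the input by const reference, which avoids copying the argument and guarantees the caller's string is left alone. The const reference is my own tweak, not a requirement; returning std::string by value is the part that matters.

```cpp
#include <string>

// Builds "@username@FIN" and returns it by value; the caller owns the result.
std::string encodeUsername(const std::string& username) {
    return "@" + username + "@FIN";
}
```

Keep the result in a std::string variable, as in the two lines above, and call c_str() on that variable when you need a char pointer. Writing const char* p = encodeUsername(tim).c_str(); would bring the original problem back, because the temporary string is destroyed at the end of that statement.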
Upvotes: 3
Reputation: 241701
The lifetime of username ends when encodeUsername returns, leaving the pointer returned by that function dangling. In other words, using it is Undefined Behaviour, which in this case manifests itself as the memory pointed to by encodeUsername's return value being reused for the newly created string.
That won't happen if you return the std::string itself.
Upvotes: 1