Reputation: 302
I am new to C++ and I still have trouble understanding when I should use pointers, references, std::move. I have programmed a short function to split strings using a delimiter.
std::vector<std::string> mylib::split(std::string string, char delimiter) {
std::vector<std::string> result = std::vector<std::string>();
std::string cache = std::string();
cache.reserve(string.size());
for (char c : string) {
if (c == delimiter) {
result.push_back(std::string(cache));
cache.clear();
} else {
cache += c;
}
}
cache.shrink_to_fit();
result.push_back(cache);
return result;
}
I have a few questions to this function: Should I use
std::vector<std::string> mylib::split(std::string string, char delimiter) {
or
std::vector<std::string> mylib::split(std::string &string, char delimiter) {
and should it be
result.push_back(std::string(cache));
or
result.push_back(std::move(std::string(cache)));
And do I have to care about the destruction of any of the used objects or could I use this function just like that? Also, if there are any other ways to improve this method I would be happy to hear your ideas.
Upvotes: 1
Views: 142
Reputation: 56587
The rule of thumb is:
Use &
when you want to signal that your function can modify the argument and changes should be visible outside. Also passing an argument via &
does not create a copy.
Use const
when you want to indicate that the function is not going to modify the object. Although it will copy it.
Use const &
to combine both those situations above: the object will not be modified by the function but also will not be copied (which is important when copying is expensive like in the case of strings)
So for you the best solution is: use const std::string& value
(please change the name of the variable). You don't modify the string and it may be too big to copy it.
As for std::move
. What it does is (basically) it turns a non-temporary object to a temporary. So as you can see using std::move
on temporaries (your case) is pointless.
Why do we do that? In order to allow the C++ compiler to apply aggressive optimizations. Consider this code:
std::string text = "abcd";
result.push_back(text);
C++ doesn't know that text
is not going to be used anymore. So it has to copy it. But with this:
std::string text = "abcd";
result.push_back(std::move(text));
you tell the C++ compiler: "hey, I'm not going to use text
variable anymore, so you don't have to copy it, just move its internals to the vector". And all you have to know is that in the case of strings copying is more expensive (linear complexity) than moving (always constant time).
Warning - an opinion incoming: I find the std::move
name really confusing. It doesn't actually move anything. It's just a static cast. Why not call it std::cast_to_temp
or something?
Anyway this result.push_back(std::move(std::string(cache)));
is wrong. Pointless. You don't avoid a copy and std::move
does nothing. But this result.push_back(std::move(cache));
indeed makes sense. But careful analysis has to be made: is cache
really not needed afterwards? It looks like it is (although I didn't dive deeply into your code).
Finally you only care about destruction when you construct, i.e. for each new
you need a delete
. You don't have new
, you don't need delete
*.
* that's not always true, sometimes you deal with a nasty code that does an implicit, invisible new
for you but actually forces you to do delete
. Yeah, sometimes it is hard. But AFAIK this doesn't happen in the standard (or any other self respecting) library. This is a very bad practice.
Final note: of course this is C++, in reality everything is much more complicated, there are exceptions to each rule and so on, and so on. But don't worry about details at the moment, it is ok to learn gradually.
Upvotes: 1
Reputation: 11440
The best would be
std::vector<std::string> mylib::split(const std::string &string, char delimiter) {
as you don't copy more than needed and you guarantee to your caller you won't modify their string. And it makes the API way clearer on the intent.
result.push_back(std::move(std::string(cache)));
IMO (and not everybody will agree), you should not worry just yet about std::move'ing the string. Yes, you could, because cache is not used in either case (or cleared anyway). You should only start caring when the performance becomes an issue. And since you are copying char one by one, I doubt the highest performance improvement will come from move semantics.
Dropping the initializers and going with token copy as discussed:
std::vector<std::string> split(const std::string& string, char delimiter)
{
std::vector<std::string> result;
size_t pos = 0;
for (size_t scan = 0; scan < string.size(); ++scan)
{
if (string[scan] == delimiter)
{
result.push_back(string.substr(pos, scan - pos));
pos = scan + 1;
}
}
result.push_back(string.substr(pos, string.size() - pos));
return result;
}
Upvotes: 2
Reputation: 2479
Pass by value or by reference:
This will create a copy of string:
std::vector<std::string> mylib::split(std::string string, char delimiter)
This will pass a reference of string:
std::vector<std::string> mylib::split(std::string &string, char delimiter)
In the above cases, you would prefer to pass reference, because you return a std::vector and you only use string to read a part of it to push it to the vector. Now because you only read it, it would even be better to make it const:
std::vector<std::string> mylib::split(const std::string &string, char delimiter)
Then you are 100% sure that the variable you gave to the split function remains unchanged. Imagine the following:
std::string string = "some,values";
If you pass string to split by value:
std::vector<std::string> mylib::split(std::string string, char delimiter) {
string = "something else";
...
}
After calling split, you read the string variable:
std::cout << string << std::endl;
This will print "some,values".
If you pass by reference however:
std::vector<std::string> mylib::split(std::string &string, char delimiter) {
string = "something else";
}
It will print "something else", basically your modifying the real string.
If you make it const, then the compiler will not allow you to overwrite string in the split function. So unless your variable needs to be changed in the function, pass a const reference to it.
Moving or copying:
This will create a copy of string:
result.push_back(std::string(cache));
This will move the contents of cache.
result.push_back(std::move(cache));
If you know that creating a copy will usually cost more than moving things around, then you understand that moving will be more efficient, i.e. faster. But then again, adding move calls for a string sounds like premature optimization. Unless you are dealing with a lot of data, I don't see a reason to move a string instead of copying because it makes the code less readable and the performance gain would probably be minimal.
Pointers vs references
Basically you can think of a pointer like you think of a reference. It's an address to a piece of memory. The syntax is different, pointers can be null while references can't. Pointers can also be allocated on the heap while references are always allocated on the stack.
std::string string = "some,values";
std::vector<std::string> mylib::split(std::string *string, char delimiter) {
*string = "something else";
...
}
std::cout << *string << std::endl; // will print "something else"
std::cout << string << std::endl; // will print the address of the pointer
Notice the * in split is telling that you pass a pointer, the * before string '*string = "something else"' means that the pointer is dereferenced and that the value is written to the location of the pointer. Same for the print, we read the value and print it by dereferencing the pointer.
I hope that clears up some doubts you have.
Upvotes: 1
Reputation: 70
You should read more about C++ pass by reference vs. pass by value. But to make it simple,
std::vector<std::string> mylib::split(std::string string, char delimiter) {
when you do not want to change the variable itself when you pass it to function. This means you pass string object by value and you make a copy inside a function of that string. std::vector<std::string> mylib::split(std::string &string, char delimiter) {
mean you are passing string object by reference. So when you change the string inside a function you will change the string itself independent where have you declared it. Also it is more performance friendly to pas by reference since you do not have to copy the object.And do I have to care about the destruction of any of the used objects or could I use this function just like that?
No, you do do not have to worry about destruction of any objects since you only use STL and not user defined objects. Moreover, it should be like that: result.push_back(std::string(cache))
. Don't use std::move
when you push object to the container.
Upvotes: 0