Reputation: 15952
Suppose I have a doubly-linked list, with a byte associated with each element. Wikipedia has a good visual; pretend the numbers are hexadecimal numbers:
Now, the naïve ("immediately obvious") way to build a string from the list, given a pointer to the last node in the string (in this example, the 37
node), is:
using std::string;
string node::makeString()
{
return this->prev->makeString() + this->data;
}
The goal is the string "\0x12\0x99\0x37"
. However, this function requires lots of reallocations of the string being built and lots of function call overhead (it can't be tail-call optimized); doubtless there are other inefficiences that I'm not aware of.
Is there a better way? Of course, I'm not just looking to minimize theoretical time complexity; I'm really trying to find a method that will be fastest in practice.
Upvotes: 1
Views: 248
Reputation: 363797
Start with the empty std::string
, walk back to the front of the list, then loop through the nodes and push_back
onto the string. That takes linear time, which is optimal for this problem.
Further optimizations are possible if you know up front how long the list is. In that case, you can start with a string of the appropriate length and insert characters into it directly.
Upvotes: 3
Reputation: 26429
Is there a better way?
Sure.
std::string::reserve()
std::string::append()
for that.Upvotes: 2
Reputation: 311088
The bottle-neck is reallocating the string. So I would firstly count the number of nodes and after that I would build the string. For example
std::string::size_type n = 1;
for ( ; node->prev; node = node->prev ) ++n;
std::string s;
s.reserve( n );
for ( ; node->next; node = node->next ) s.push_back( node->data );
Upvotes: 0
Reputation: 490518
Given the constraints at hand (that you're basically stuck walking the list in reverse), it's probably best to build the string in reverse as well, then when all the characters have been added, reverse the string.
The way you're doing things right now you're getting quadratic complexity -- every time you insert another character, put that character into a string, the copy all the existing characters to the new string, so each insertion is linear and N insertions is roughly O(N2).[Note: actually, I'd misread the code -- it's bad, but not quite this bad] As it is right now, we can expect every character to be copied at least twice -- once to the stack, and once to the destination string. The inefficiency is probably most apparent if you think in terms of memory bandwidth. At bare minimum, each call has to read a pointer, write the current character to the stack and write a return address, all to copy one byte from the linked list to the destination string. Assuming a 64-bit implementation, we're looking at a ratio of something like 18:1 in terms of reading and writing pointers (and other overhead) vs. copying the data we actually care about.
By building the string backward, then reversing it, you add new characters at the end of the string, where you can expect it to happen quickly. Then you do all that extra moving only once instead of once for every character you add.
If you were using std::vector<char>
, you could state categorically that adding a character at the end of the string was amortized constant complexity. With std::string
we don't (that I recall) get a complexity guarantee, but it would take an amazingly terrible implementation for it to be as bad as a recursive call just to copy one character.
Another possibility would be to use an std::deque<char>
instead of a string
. With a deque, you can insert characters at the front without moving all the other characters to make room. This does have two shortcomings: the result isn't (normally) a contiguous block of memory, and you typically get an extra level of indirection, so access to the data after it's built is marginally slower.
Upvotes: 1
Reputation: 71100
Personally, I'd create a linked list of strings, or rather, char arrays, and then fill each node in backwards.
struct StringNode
{
char buffer [20];
struct StringNode *next;
};
StringNode *node = new StringNode;
node->buffer [19] = '\0';
node->next = 0;
size_t output = 18;
size_t count = 1;
for (ptr = last item ; ptr ; ptr = ptr->prev)
{
node->buffer [output] = ptr->character;
++count;
if (output)
{
--output;
}
else
{
StringNode *newnode = new StringNode;
newnode->buffer [19] = '\0';
newnode->next = node;
output = 18;
node = newnode;
}
}
string output (count); // preallocate enough storage for whole string and initialise to an empty string
while(node)
{
output += &node->buffer [output+1];
// or: cout << &node->buffer [output+1];
StringNode *nextnode = node->next;
delete node;
node = nextnode;
output = -1;
}
Upvotes: 0
Reputation: 4939
The inefficiency in your solution is due to the recursion. For a linked list, set up a string and use a simple while loop. This will lead to better performance, as there will not be the overhead of one function call per string.
string makeString() {
Node* p = l.end(); //l is the linked list. end is its tail node
string s = "";
while(p != NULL) {
s = p.value() + s; //append the value to the string
p = p.prev(); //advance p to the prev node
}
return s;
}
Of course for even better performance, I would consider not using a linked data structure as they can lead to inefficiencies dealing with locality in memory.
Upvotes: 0