George Hilliard
George Hilliard

Reputation: 15952

Efficient algorithm to step through all of linked list (in C++)

Suppose I have a doubly-linked list, with a byte associated with each element. Wikipedia has a good visual; pretend the numbers are hexadecimal numbers:

https://upload.wikimedia.org/wikipedia/commons/thumb/5/5e/Doubly-linked-list.svg/500px-Doubly-linked-list.svg.png

Now, the naïve ("immediately obvious") way to build a string from the list, given a pointer to the last node in the string (in this example, the 37 node), is:

using std::string;

string node::makeString()
{
  return this->prev->makeString() + this->data;
}

The goal is the string "\0x12\0x99\0x37". However, this function requires lots of reallocations of the string being built and lots of function call overhead (it can't be tail-call optimized); doubtless there are other inefficiences that I'm not aware of.

Is there a better way? Of course, I'm not just looking to minimize theoretical time complexity; I'm really trying to find a method that will be fastest in practice.

Upvotes: 1

Views: 248

Answers (6)

Fred Foo
Fred Foo

Reputation: 363797

Start with the empty std::string, walk back to the front of the list, then loop through the nodes and push_back onto the string. That takes linear time, which is optimal for this problem.

Further optimizations are possible if you know up front how long the list is. In that case, you can start with a string of the appropriate length and insert characters into it directly.

Upvotes: 3

SigTerm
SigTerm

Reputation: 26429

Is there a better way?

Sure.

  1. Locate beginning of the list. While locating beginning of the list, count total number of nodes (if it isn't already available) and calculate total string size for final string.
  2. Preallocate string of required size using std::string::reserve()
  3. Walk through the list from first node to the last, adding data to the end of previously preallocated string. You can use std::string::append() for that.

Upvotes: 2

Vlad from Moscow
Vlad from Moscow

Reputation: 311088

The bottle-neck is reallocating the string. So I would firstly count the number of nodes and after that I would build the string. For example

std::string::size_type n = 1;
for ( ; node->prev; node = node->prev ) ++n;
std::string s;
s.reserve( n );
for ( ; node->next; node = node->next ) s.push_back( node->data );

Upvotes: 0

Jerry Coffin
Jerry Coffin

Reputation: 490518

Given the constraints at hand (that you're basically stuck walking the list in reverse), it's probably best to build the string in reverse as well, then when all the characters have been added, reverse the string.

The way you're doing things right now you're getting quadratic complexity -- every time you insert another character, put that character into a string, the copy all the existing characters to the new string, so each insertion is linear and N insertions is roughly O(N2).[Note: actually, I'd misread the code -- it's bad, but not quite this bad] As it is right now, we can expect every character to be copied at least twice -- once to the stack, and once to the destination string. The inefficiency is probably most apparent if you think in terms of memory bandwidth. At bare minimum, each call has to read a pointer, write the current character to the stack and write a return address, all to copy one byte from the linked list to the destination string. Assuming a 64-bit implementation, we're looking at a ratio of something like 18:1 in terms of reading and writing pointers (and other overhead) vs. copying the data we actually care about.

By building the string backward, then reversing it, you add new characters at the end of the string, where you can expect it to happen quickly. Then you do all that extra moving only once instead of once for every character you add.

If you were using std::vector<char>, you could state categorically that adding a character at the end of the string was amortized constant complexity. With std::string we don't (that I recall) get a complexity guarantee, but it would take an amazingly terrible implementation for it to be as bad as a recursive call just to copy one character.

Another possibility would be to use an std::deque<char> instead of a string. With a deque, you can insert characters at the front without moving all the other characters to make room. This does have two shortcomings: the result isn't (normally) a contiguous block of memory, and you typically get an extra level of indirection, so access to the data after it's built is marginally slower.

Upvotes: 1

Skizz
Skizz

Reputation: 71100

Personally, I'd create a linked list of strings, or rather, char arrays, and then fill each node in backwards.

struct StringNode
{
  char buffer [20];
  struct StringNode *next;
};

StringNode *node = new StringNode;
node->buffer [19] = '\0';
node->next = 0;
size_t output = 18;
size_t count = 1;

for (ptr = last item ; ptr ; ptr = ptr->prev)
{
  node->buffer [output] = ptr->character;
  ++count;
  if (output)
  {
    --output;
  }
  else
  {
    StringNode *newnode = new StringNode;
    newnode->buffer [19] = '\0';
    newnode->next = node;
    output = 18;
    node = newnode;
  }
}

string output (count); // preallocate enough storage for whole string and initialise to an empty string

while(node)
{
  output += &node->buffer [output+1];
  // or: cout << &node->buffer [output+1];
  StringNode *nextnode = node->next;
  delete node;
  node = nextnode;
  output = -1;
}

Upvotes: 0

pippin1289
pippin1289

Reputation: 4939

The inefficiency in your solution is due to the recursion. For a linked list, set up a string and use a simple while loop. This will lead to better performance, as there will not be the overhead of one function call per string.

string makeString() {
  Node* p = l.end(); //l is the linked list. end is its tail node
  string s = "";
  while(p != NULL) {
    s = p.value() + s; //append the value to the string
    p = p.prev(); //advance p to the prev node
  }
  return s;
}

Of course for even better performance, I would consider not using a linked data structure as they can lead to inefficiencies dealing with locality in memory.

Upvotes: 0

Related Questions