
Reputation: 30492

Randomly permute N first elements of a singly linked list

I have to permute N first elements of a singly linked list of length n, randomly. Each element is defined as:

typedef struct E_s
{
  struct E_s *next;
}E_t;

I have a root element and I can traverse the whole linked list of size n. What is the most efficient technique to permute only N first elements (starting from root) randomly?

So, given a->b->c->d->e->f->...x->y->z, I need to produce something like f->a->e->c->b->...x->y->z

My specific case:

n-N is about 20% of n
I have limited RAM, so the best algorithm should work in place
I have to do it in a loop, over many iterations, so speed matters
Ideal randomness (a uniform distribution) is not required; it's OK if it's "almost" random
I already traverse the N elements before permuting (for other needs), so maybe I could use that pass for the permutation as well

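UPDATE: I found this paper (http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6V0F-45FKW7V-6B&_user=10&_coverDate=10%2F05%2F1992&_rdoc=1&_fmt=high&_orig=search&_origin=search&_sort=d&_docanchor=&view=c&_searchStrId=1576191476&_rerunOrigin=google&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=53029a9616acb1e9e90017e288d882ef&searchtype=a). It states that it presents an algorithm with O(log n) stack space and expected O(n log n) time.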

Upvotes: 19

Answers (11)

Emilio M Bumachar

Reputation: 2613

If both of the following conditions are true:

you have plenty of program memory (many embedded systems execute directly from flash);
your solution can tolerate the "randomness" repeating often,

then you can choose a sufficiently large set of specific permutations, defined at programming time, write a program that generates the code implementing each of them, and then iterate over them at runtime.

Upvotes: 0

salva

Reputation: 10244

The list randomizer below has O(N*log N) time complexity and O(1) memory usage.

It is based on the recursive algorithm described in my other post, modified to be iterative instead of recursive in order to eliminate the O(log N) memory usage.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

typedef struct node {
    struct node *next;
    char *str;
} node;


unsigned int
next_power_of_two(unsigned int v) {
    v--;
    v |= v >> 1;
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    return v + 1;
}

void
dump_list(node *l) {
    printf("list:");
    for (; l; l = l->next) printf(" %s", l->str);
    printf("\n");
}

node *
array_to_list(unsigned int len, char *str[]) {
    unsigned int i;
    node *list;
    node **last = &list;
    for (i = 0; i < len; i++) {
        node *n = malloc(sizeof(node));
        n->str = str[i];
        *last = n;
        last = &n->next;
    }
    *last = NULL;
    return list;
}

node **
reorder_list(node **last, unsigned int po2, unsigned int len) {
    node *l = *last;
    node **last_a = last;
    node *b = NULL;
    node **last_b = &b;
    unsigned int len_a = 0;
    unsigned int i;
    for (i = len; i; i--) {
        double pa = (1.0 + RAND_MAX) * (po2 - len_a) / i;
        unsigned int r = rand();
        if (r < pa) {
            *last_a = l;
            last_a = &l->next;
            len_a++;
        }
        else {
            *last_b = l;
            last_b = &l->next;
        }
        l = l->next;
    }
    *last_b = l;
    *last_a = b;
    return last_b;
}

unsigned int
min(unsigned int a, unsigned int b) {
    return (a > b ? b : a);
}

void
randomize_list(node **l, unsigned int len) {
    unsigned int po2 = next_power_of_two(len);
    for (; po2 > 1; po2 >>= 1) {
        unsigned int j;
        node **last = l;
        for (j = 0; j < len; j += po2)
            last = reorder_list(last, po2 >> 1, min(po2, len - j));
    }
}

int
main(int len, char *str[]) {
    if (len > 1) {
        node *l;
        len--; str++; /* skip program name */
        l = array_to_list(len, str);
        randomize_list(&l, len);
        dump_list(l);
    }
    return 0;
}

/* try as:   a.out list of words foo bar doz li 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
*/

Note that this version of the algorithm is completely cache-unfriendly; the recursive version would probably perform much better!

Upvotes: 0

unsym

Reputation: 2200

There is an algorithm that takes O(sqrt(N)) space and O(N) time for a singly linked list.

It does not generate a uniform distribution over all permutations, but it can give a good permutation that is not easily distinguishable from a random one. The basic idea is similar to permuting a matrix by rows and columns, as described below.

Algorithm

Let the number of elements be N, and m = floor(sqrt(N)). Assuming a "square matrix" N = m*m makes this method much clearer.

1. In the first pass, store pointers to the elements that are m elements apart as p_0, p_1, p_2, ..., p_m. That is, p_0->next->...->next (m times) == p_1 should be true.
2. Permute each row.
   - For i = 0 to m-1 do:
   - Index all elements between p_i->next and p_(i+1)->next in the linked list with an array of size O(m)
   - Shuffle this array using a standard method
   - Relink the elements using this shuffled array
3. Permute each column.
   - Initialize an array A to store the pointers p_0, ..., p_m. It is used to traverse the columns
   - For i = 0 to m-1 do:
   - Index all elements pointed to by A[0], A[1], ..., A[m-1] in the linked list with an array of size m
   - Shuffle this array
   - Relink the elements using this shuffled array
   - Advance the pointers to the next column: A[j] := A[j]->next for each j

Note that p_0 is an element that points to the first element and p_m points to the last element. Also, if N != m*m, you may use a separation of m+1 for some p_i instead. Now you get a "matrix" such that the p_i point to the start of each row.

Analysis and randomness

Space complexity: This algorithm needs O(m) space to store the starts of the rows, O(m) space to store the array, and O(m) space to store the extra pointers during column permutation. Hence, the space complexity is ~ O(3*sqrt(N)). For N = 1000000, that is around 3000 entries and 12 kB of memory.
Time complexity: It is obviously O(N). It walks through the "matrix" either row by row or column by column.
Randomness: The first thing to note is that each element can go anywhere in the matrix through row and column permutation. It is very important that elements can go anywhere in the linked list. Second, though it does not generate every permutation, it does generate part of them. To find the number of permutations, assume N=m*m: each row has m! permutations and there are m rows, so we have (m!)^m. If column permutation is also included, the count is exactly (m!)^(2*m), so it is almost impossible to get the same sequence twice.
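As a rough sanity check of the count above (assuming N = m*m and a single row pass followed by a single column pass), the number of reachable orderings can be compared with the N! orderings of a truly uniform shuffle:

```latex
\[
(m!)^{2m} \quad\text{vs.}\quad N! = (m^2)!
\qquad\text{e.g. } m = 3:\quad (3!)^{6} = 46656 \quad\text{vs.}\quad 9! = 362880 .
\]
```

So for N = 9 a single round reaches only about 13% of all orderings and cannot be uniform by itself; covering more orderings needs further rounds.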

It is highly recommended to repeat the second and third steps at least one more time to get a more random sequence, because this suppresses almost all correlation between an element's row and column and its original location. It is also important when your list is not "square". Depending on your needs, you may want to use even more repetitions. The more repetitions you use, the more permutations can be reached and the more random the result is. I remember that it is possible to generate a uniform distribution for N=9, and I guess that it is possible to prove that, as the number of repetitions tends to infinity, the result approaches the true uniform distribution.

Edit: The time and space complexities are tight bounds and are almost the same in any situation. I think this space consumption can satisfy your needs. If you have any doubt, you may try it on a small list, and I think you will find it useful.

Upvotes: 0

salva

Reputation: 10244

An O(N log N), easy-to-implement solution that does not require extra storage:

Say you want to randomize L:

1. If L has 1 or 0 elements, you are done.
2. Create two empty lists L1 and L2.
3. Loop over L, destructively moving its elements to L1 or L2, choosing between the two at random.
4. Repeat the process for L1 and L2 (recurse!).
5. Join L1 and L2 into L3.
6. Return L3.

Update

At step 3, L should be divided into equal-sized (+-1) lists L1 and L2 in order to guarantee the O(N*log N) complexity. That can be done by dynamically adjusting the probability of an element going into L1 or L2:

p(insert element into L1) = (1/2 * len0(L) - len(L1)) / len(L)

where

len(M) is the current number of elements in list M
len0(L) is the number of elements that were in L at the beginning of step 3
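The steps above, with the balanced split from the update, might be sketched in C roughly as follows. This is only an illustration, not the author's code: the name shuffle_prefix, the node layout with an int value, and the use of rand() % left (which carries a slight modulo bias) are all assumptions. The function shuffles the first len nodes reachable from *link, leaves the rest of the list attached, and recurses O(log len) deep.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node type for illustration. */
typedef struct node { struct node *next; int value; } node;

/* Shuffle the first len nodes starting at *link.
 * Returns the address of the link that points to the untouched remainder.
 * Assumes the list holds at least len nodes. */
node **shuffle_prefix(node **link, size_t len)
{
    if (len == 0) return link;
    if (len == 1) return &(*link)->next;

    size_t want_a = len / 2;            /* target size of L1 */
    size_t len_a = 0;                   /* current size of L1 */
    node *cur = *link;
    node *a = NULL, **tail_a = &a;      /* L1 */
    node *b = NULL, **tail_b = &b;      /* L2 */

    for (size_t left = len; left > 0; left--) {
        assert(cur != NULL);
        node *next = cur->next;
        /* Move cur to L1 with probability (want_a - len_a) / left. */
        if ((size_t)rand() % left < want_a - len_a) {
            *tail_a = cur; tail_a = &cur->next; len_a++;
        } else {
            *tail_b = cur; tail_b = &cur->next;
        }
        cur = next;
    }

    *tail_a = b;        /* join L1 and L2 ... */
    *tail_b = cur;      /* ... and reattach the remainder */
    *link = a;

    /* Recurse on both halves; each call returns the link to what follows. */
    node **mid = shuffle_prefix(link, len_a);
    return shuffle_prefix(mid, len - len_a);
}
```

A call such as shuffle_prefix(&root, N) would then permute only the first N elements, which matches the question's requirement of leaving the tail in place.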

Upvotes: 0

phimuemue

Reputation: 35983

I've not tried it, but you could use a "randomized merge-sort".

To be more precise, you randomize the merge routine. You do not merge the two sublists systematically; instead you decide by a coin toss (i.e. with probability 0.5 you select the first element of the left sublist, and with probability 0.5 the first element of the right sublist).

This should run in O(n log n) time and use O(1) extra space (if properly implemented).

Below is a sample implementation in C that you might adapt to your needs. Note that this implementation uses randomisation in two places: in splitList and in merge. However, you might choose just one of these two places. I'm not sure whether the resulting distribution is uniform (I'm almost sure it is not), but some test cases yielded decent results.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>   // for time(), used to seed rand()

#define N 40

typedef struct _node{
  int value;
  struct _node *next;
} node;

void splitList(node *x, node **leftList, node **rightList){
  int lr=0; // left-right-list-indicator
  *leftList = 0;
  *rightList = 0;
  while (x){
    node *xx = x->next;
    lr=rand()%2;
    if (lr==0){
      x->next = *leftList;
      *leftList = x;
    }
    else {
      x->next = *rightList;
      *rightList = x;
    }
    x=xx;
    lr=(lr+1)%2;
  }
}

void merge(node *left, node *right, node **result){
  *result = 0;
  while (left || right){
    if (!left){
      node *xx = right;
      while (right->next){
        right = right->next;
      }
      right->next = *result;
      *result = xx;
      return;
    }
    if (!right){
      node *xx = left;
      while (left->next){
        left = left->next;
      }
      left->next = *result;
      *result = xx;
      return;
    }
    if (rand()%2==0){
      node *xx = right->next;
      right->next = *result;
      *result = right;
      right = xx;
    }
    else {
      node *xx = left->next;
      left->next = *result;
      *result = left;
      left = xx;
    }
  }
}

void mergeRandomize(node **x){
  if ((!*x) || !(*x)->next){
    return;
  }
  node *left;
  node *right;
  splitList(*x, &left, &right);
  mergeRandomize(&left);
  mergeRandomize(&right);
  merge(left, right, &*x);
}

int main(int argc, char *argv[]) {
  srand(time(NULL));
  printf("Original Linked List\n");
  int i;
  node *x = (node*)malloc(sizeof(node));
  node *root=x;
  x->value=0;
  for(i=1; i<N; ++i){
    node *xx;
    xx = (node*)malloc(sizeof(node));
    xx->value=i;
    xx->next=0;
    x->next = xx;
    x = xx;
  }
  x=root;
  do {
    printf ("%d, ", x->value);
    x=x->next;
  } while (x);

  x = root;
  node *left, *right;
  mergeRandomize(&x);
  if (!x){
    printf ("Error.\n");
    return -1;
  }
  printf ("\nNow randomized:\n");
  do {
    printf ("%d, ", x->value);
    x=x->next;
  } while (x);
  printf ("\n");
  return 0;
}

Upvotes: 6

Jake Stevens-Haas

Reputation: 1666

If you know both N and n, I think you can do it simply. It's fully random, too. You only iterate through the whole list once, and through the randomized part each time you add a node. I think that's O(n+NlogN) or O(n+N^2). I'm not sure. It's based upon updating the conditional probability that a node is selected for the random portion given what happened to previous nodes.

1. Determine the probability that a certain node will be selected for the random portion given what happened to previous nodes (p=(N-size)/(n-position), where size is the number of nodes previously chosen and position is the number of nodes previously considered).
2. If the node is not selected for the random part, move to step 4. If the node is selected for the random part, randomly choose its place in the random part based upon the size so far (place=(random between 0 and 1) * size, where size is again the number of previous nodes).
3. Place the node where it needs to go and update the pointers. Increment size. Change to looking at the node that previously pointed at what you were just looking at and moved.
4. Increment position and look at the next node.

I don't know C, but I can give you the pseudocode. In this, I refer to the permutation as the first elements that are randomized.

integer size=0          //size of the permutation so far
integer position=0      //number of nodes you've traversed so far
Node    head=head of linked list    //this holds the node at the head of your linked list
Node    current_node=head           //starting at head, you'll move this down the list to check each node
Node    previous=null               //node just before current_node in the list; null while current_node is the head

While ((size not equal to N) and (current_node is not null)){   //iterate until the permutation is full. We should never pass the end of the list, but just in case, I include that condition

    Node next_node=current_node.next     //remember the successor, because current_node may be moved
    pperm=(N-size)/(n-position)          //probability that the current node will be in the permutation
    if ([generate a random decimal between 0 and 1] < pperm){   //this decides whether or not the current node will go in the permutation

        if (previous is null){      //we are at the start of the list, so there's no need to change the list
            previous=current_node
        }
        else{
            pfirst=1/(size+1)       //probability that a node selected for the permutation will be first
            integer place_in_permutation = round down([generate a random decimal between 0 and 1]/pfirst)   //place in the permutation. note that the head =0.
            previous.next=next_node     //unlink current_node

            if (place_in_permutation==0){       //if placing current node first, must change the head
                current_node.next=head          //set the current node to point to the previous head
                head=current_node               //set the variable head to point to the current node
            }
            else{
                Node temp=head
                for (counter starts at zero. counter is less than place_in_permutation-1. Each iteration, increment counter){
                    temp=temp.next
                }   //at this time, temp points to the node right before the insertion spot
                current_node.next=temp.next
                temp.next=current_node
                if (temp is previous){ previous=current_node }   //the node went back where it was, so it is now the predecessor of next_node
            }
        }
        size++      //since we add one to the permutation, increase the size of the permutation
    }
    else{
        previous=current_node
    }
    position++
    current_node=next_node
}

You could probably increase the efficiency if you held on to the most recently added node in case you had to add one to the right of it.
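For readers who want something compilable, the pseudocode could be translated to C along these lines. This is a sketch under assumptions rather than the author's implementation: the function name sample_to_front is made up, the element type is the one from the question, and both random draws use rand() with a modulo (slightly biased). It also keeps the successor in a local variable and updates previous when a node is reinserted directly after it, so that no node is visited twice.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct E_s { struct E_s *next; } E_t;   /* element type from the question */

/* Move N randomly chosen nodes out of the first n to the front of the list,
 * in random order, and return the new head.
 * Assumes N <= n and that the list holds at least n nodes. */
E_t *sample_to_front(E_t *head, size_t N, size_t n)
{
    assert(N <= n);
    size_t size = 0;          /* nodes already in the permutation */
    size_t position = 0;      /* nodes considered so far */
    E_t *previous = NULL;     /* node just before current in the list */
    E_t *current = head;

    while (size != N && current != NULL) {
        E_t *next = current->next;
        /* Select with probability (N - size) / (n - position). */
        if ((size_t)rand() % (n - position) < N - size) {
            size_t place = (size_t)rand() % (size + 1);
            if (previous == NULL) {
                previous = current;         /* head node stays where it is */
            } else {
                previous->next = next;      /* unlink current */
                if (place == 0) {
                    current->next = head;
                    head = current;
                } else {
                    E_t *temp = head;       /* walk to index place - 1 */
                    for (size_t c = 0; c + 1 < place; c++)
                        temp = temp->next;
                    current->next = temp->next;
                    temp->next = current;
                    if (temp == previous)   /* reinserted at its old spot */
                        previous = current;
                }
            }
            size++;
        } else {
            previous = current;
        }
        position++;
        current = next;
    }
    return head;
}
```

Calling sample_to_front(root, N, N) selects every one of the first N nodes and therefore simply permutes that prefix; the walk to the insertion point is what makes the cost quadratic in N in the worst case.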

Upvotes: 1

Potatoswatter

Reputation: 137780

First, get the length of the list and the last element. You say you already do a traversal before randomization, that would be a good time.

Then, turn it into a circular list by linking the last element to the first element. Get four pointers into the list by dividing the size by four and iterating through it for a second pass. (These pointers could also be obtained from the previous pass by incrementing once, twice, and three times per four iterations in the previous traversal.)

For the randomization pass, traverse again and swap pointers 0 and 2 and pointers 1 and 3 with 50% probability. (Do either both swap operations or neither; just one swap will split the list in two.)

Here is some example code. It looks like it could be a little more random, but I suppose a few more passes could do the trick. Anyway, analyzing the algorithm is more difficult than writing it :vP . Apologies for the lack of indentation; I just punched it into ideone in the browser.

http://ideone.com/9I7mx

#include <algorithm>  // copy, swap
#include <iostream>
#include <cstdlib>
#include <ctime>
using namespace std;

struct list_node {
int v;
list_node *n;
list_node( int inv, list_node *inn )
: v( inv ), n( inn) {}
};

int main() {
srand( time(0) );

// initialize the list and 4 pointers at even intervals
list_node *n_first = new list_node( 0, 0 ), *n = n_first;
list_node *p[4];
p[0] = n_first;
for ( int i = 1; i < 20; ++ i ) {
n = new list_node( i, n );
if ( i % (20/4) == 0 ) p[ i / (20/4) ] = n;
}
// intervals must be coprime to list length!
p[2] = p[2]->n;
p[3] = p[3]->n;
// turn it into a circular list
n_first->n = n;

// swap the pointers around to reshape the circular list
// one swap cuts a circular list in two, or joins two circular lists
// so perform one cut and one join, effectively reordering elements.
for ( int i = 0; i < 20; ++ i ) {
list_node *p_old[4];
copy( p, p + 4, p_old );
p[0] = p[0]->n;
p[1] = p[1]->n;
p[2] = p[2]->n;
p[3] = p[3]->n;
if ( rand() % 2 ) {
swap( p_old[0]->n, p_old[2]->n );
swap( p_old[1]->n, p_old[3]->n );
}
}

// you might want to turn it back into a NULL-terminated list

// print results
for ( int i = 0; i < 20; ++ i ) {
cout << n->v << ", ";
n = n->n;
}
cout << '\n';
}

Upvotes: 2

sprite

Reputation: 3752

Similar to Vlad's answer, here is a slight improvement (statistically):

Indices in the algorithm are 1-based.

1. Initialize lastR = -1.
2. If N <= 1, go to step 6.
3. Pick a random number r between 1 and N.
4. If r != N:
   4.1 Traverse the list to item r and its predecessor.
       - If lastR != -1:
         - If r == lastR, your pointer to the predecessor of the r'th item is still valid.
         - If r < lastR, traverse to it from the beginning of the list.
         - If r > lastR, traverse to it from the predecessor of the lastR'th item.
   4.2 Remove the r'th item from the list and append it to a result list as the tail.
   4.3 Set lastR = r.
5. Decrease N by one and go to step 2.
6. Link the tail of the result list to the head of the remaining input list. You now have the original list with the first N items permuted.

Since you do not have random access, this reduces the traversal time needed within the list (by half, I assume, so asymptotically you won't gain anything).

Upvotes: 0

Vlad

Reputation: 35584

For the case when N is really big (so it doesn't fit in your memory), you can do the following (a sort of Knuth's 3.4.2P):

1. j = N
2. k = random between 1 and j
3. traverse the input list, find the k-th item and output it; remove the said item from the sequence (or mark it somehow so that you won't consider it at the next traversal)
4. decrease j and return to 2 unless j==0
5. output the rest of the list

Beware that this is O(N^2), unless you can ensure random access in step 3.
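The numbered steps above could look like this in C when the list is modified in place. It is a minimal sketch, not a tuned implementation: select_shuffle is a hypothetical name, the element type is taken from the question, and k is drawn with rand() % j, which is 0-based and slightly biased. "Output" here means appending the picked node to a result list that ends up in front of the untouched tail.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct E_s { struct E_s *next; } E_t;   /* element type from the question */

/* Permute the first N elements in place: O(N^2) time, O(1) extra memory.
 * Returns the new root. Assumes the list holds at least N elements. */
E_t *select_shuffle(E_t *root, size_t N)
{
    E_t *out = NULL;            /* head of the output list */
    E_t **out_tail = &out;      /* link where the next picked node goes */

    for (size_t j = N; j > 0; j--) {
        size_t k = (size_t)rand() % j;  /* index among the j remaining items */
        E_t **link = &root;
        while (k-- > 0)                 /* walk to the link of the k-th item */
            link = &(*link)->next;

        E_t *picked = *link;
        assert(picked != NULL);
        *link = picked->next;           /* remove it from the input */
        *out_tail = picked;             /* "output" it */
        out_tail = &picked->next;
    }
    *out_tail = root;                   /* append the rest of the list */
    return out;
}
```

Each pick walks on average a quarter of the original prefix, so this trades speed for using no extra storage at all.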

If N is relatively small, so that N items fit into memory, just load them into an array and shuffle, as @Mitch proposes.

Upvotes: 1

Peter Alexander

Reputation: 54270

I don't believe there's any efficient way to randomly shuffle singly-linked lists without an intermediate data structure. I'd just read the first N elements into an array, perform a Fisher-Yates shuffle, then reconstruct those first N elements into the singly-linked list.
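A minimal sketch of that approach in C, assuming a temporary array of N pointers is affordable. The function name shuffle_first_n is invented for illustration, the element type is the one from the question, and rand() % (i + 1) is used for brevity even though it is slightly biased.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct E_s { struct E_s *next; } E_t;   /* element type from the question */

/* Shuffle the first N elements of the list and return the new root.
 * Returns the root unchanged if N < 2 or the allocation fails.
 * Assumes the list holds at least N elements. */
E_t *shuffle_first_n(E_t *root, size_t N)
{
    if (N < 2 || root == NULL)
        return root;

    E_t **a = malloc(N * sizeof *a);
    if (a == NULL)
        return root;

    /* Read the first N node pointers into the array. */
    E_t *p = root;
    for (size_t i = 0; i < N; i++) {
        assert(p != NULL);
        a[i] = p;
        p = p->next;
    }
    E_t *rest = p;              /* first node after the shuffled prefix */

    /* Fisher-Yates shuffle of the pointer array. */
    for (size_t i = N - 1; i > 0; i--) {
        size_t j = (size_t)rand() % (i + 1);
        E_t *t = a[i]; a[i] = a[j]; a[j] = t;
    }

    /* Rebuild the prefix in the new order and reattach the tail. */
    for (size_t i = 0; i + 1 < N; i++)
        a[i]->next = a[i + 1];
    a[N - 1]->next = rest;

    root = a[0];
    free(a);
    return root;
}
```

Because the shuffle runs in a loop in the question's setting, the array could be allocated once and reused across iterations instead of calling malloc each time.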

Upvotes: 4

Mitch Wheat

Reputation: 300539

Convert to an array, use a Fisher-Yates shuffle, and convert back to a list.

Upvotes: 4
