Reputation: 30492
I have to permute N first elements of a singly linked list of length n, randomly. Each element is defined as:
typedef struct E_s
{
struct E_s *next;
}E_t;
I have a root element and I can traverse the whole linked list of size n. What is the most efficient technique to permute only N first elements (starting from root) randomly?
So, given a->b->c->d->e->f->...x->y->z I need to make smth. like f->a->e->c->b->...x->y->z
My specific case:
UPDATE: I found this paper. It states it presents an algorithm of O(log n) stack space and expected O(n log n) time.
Upvotes: 19
Views: 2322
Reputation: 2613
If both the following conditions are true:
Then you can choose a sufficiently large set of specific permutations, defined at programming time, write a code to write the code that implements each, and then iterate over them at runtime.
Upvotes: 0
Reputation: 10244
The list randomizer below has complexity O(N*log N) and O(1) memory usage.
It is based on the recursive algorithm described on my other post modified to be iterative instead of recursive in order to eliminate the O(logN) memory usage.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
typedef struct node {
struct node *next;
char *str;
} node;
unsigned int
next_power_of_two(unsigned int v) {
v--;
v |= v >> 1;
v |= v >> 2;
v |= v >> 4;
v |= v >> 8;
v |= v >> 16;
return v + 1;
}
void
dump_list(node *l) {
printf("list:");
for (; l; l = l->next) printf(" %s", l->str);
printf("\n");
}
node *
array_to_list(unsigned int len, char *str[]) {
unsigned int i;
node *list;
node **last = &list;
for (i = 0; i < len; i++) {
node *n = malloc(sizeof(node));
n->str = str[i];
*last = n;
last = &n->next;
}
*last = NULL;
return list;
}
node **
reorder_list(node **last, unsigned int po2, unsigned int len) {
node *l = *last;
node **last_a = last;
node *b = NULL;
node **last_b = &b;
unsigned int len_a = 0;
unsigned int i;
for (i = len; i; i--) {
double pa = (1.0 + RAND_MAX) * (po2 - len_a) / i;
unsigned int r = rand();
if (r < pa) {
*last_a = l;
last_a = &l->next;
len_a++;
}
else {
*last_b = l;
last_b = &l->next;
}
l = l->next;
}
*last_b = l;
*last_a = b;
return last_b;
}
unsigned int
min(unsigned int a, unsigned int b) {
return (a > b ? b : a);
}
randomize_list(node **l, unsigned int len) {
unsigned int po2 = next_power_of_two(len);
for (; po2 > 1; po2 >>= 1) {
unsigned int j;
node **last = l;
for (j = 0; j < len; j += po2)
last = reorder_list(last, po2 >> 1, min(po2, len - j));
}
}
int
main(int len, char *str[]) {
if (len > 1) {
node *l;
len--; str++; /* skip program name */
l = array_to_list(len, str);
randomize_list(&l, len);
dump_list(l);
}
return 0;
}
/* try as: a.out list of words foo bar doz li 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
*/
Note that this version of the algorithm is completely cache unfriendly, the recursive version would probably perform much better!
Upvotes: 0
Reputation: 2200
There is an algorithm takes O(sqrt(N))
space and O(N)
time, for a singly linked list.
It does not generate a uniform distribution over all permutation sequence, but it can gives good permutation that is not easily distinguishable. The basic idea is similar to permute a matrix by rows and columns as described below.
Let the size of the elements to be N
, and m = floor(sqrt(N))
. Assuming a "square matrix" N = m*m
will make this method much clear.
In the first pass, you should store the pointers of elements that is separated by every m
elements as p_0, p_1, p_2, ..., p_m
. That is, p_0->next->...->next(m times) == p_1
should be true.
Permute each row
p_i->next
to p_(i+1)->next
in the link list by an array of size O(m)
Permute each column.
A
to store pointers p_0, ..., p_m
. It is used to traverse the columnsA[0], A[1], ..., A[m-1]
in the link list by an array of size m
A[i] := A[i]->next
Note that p_0
is an element point to the first element and the p_m
point to the last element. Also, if N != m*m
, you may use m+1
separation for some p_i
instead. Now you get a "matrix" such that the p_i
point to the start of each row.
Space complexity: This algorithm need O(m)
space to store the start of row. O(m)
space to store the array and O(m)
space to store the extra pointer during column permutation. Hence, time complexity is ~ O(3*sqrt(N)). For N = 1000000
, it is around 3000 entries and 12 kB memory.
Time complexity: It is obviously O(N)
. It either walk through the "matrix" row by row or column by column
Randomness: The first thing to note is that each element can go to anywhere in the matrix by row and column permutation. It is very important that elements can go to anywhere in the linked list. Second, though it does not generate all permutation sequence, it does generate part of them. To find the number of permutation, we assume N=m*m
, each row permutation has m!
and there is m row, so we have (m!)^m
. If column permutation is also include, it is exactly equal to (m!)^(2*m)
, so it is almost impossible to get the same sequence.
It is highly recommended to repeat the second and third step by at least one more time to get an more random sequence. Because it can suppress almost all the row and column correlation to its original location. It is also important when your list is not "square". Depends on your need, you may want to use even more repetition. The more repetition you use, the more permutation it can be and the more random it is. I remember that it is possible to generate uniform distribution for N=9
and I guess that it is possible to prove that as repetition tends to infinity, it is the same as the true uniform distribution.
Edit: The time and space complexity is tight bound and is almost the same in any situation. I think this space consumption can satisfy your need. If you have any doubt, you may try it in a small list and I think you will find it useful.
Upvotes: 0
Reputation: 10244
O(NlogN) easy to implement solution that does not require extra storage:
Say you want to randomize L:
is L has 1 or 0 elements you are done
create two empty lists L1 and L2
loop over L destructively moving its elements to L1 or L2 choosing between the two at random.
repeat the process for L1 and L2 (recurse!)
join L1 and L2 into L3
return L3
Update
At step 3, L should be divided into equal sized (+-1) lists L1 and L2 in order to guaranty best case complexity (N*log N). That can be done adjusting the probability of one element going into L1 or L2 dynamically:
p(insert element into L1) = (1/2 * len0(L) - len(L1)) / len(L)
where
len(M) is the current number of elements in list M
len0(L) is the number of elements there was in L at the beginning of step 3
Upvotes: 0
Reputation: 35983
I've not tried it, but you could use a "randomized merge-sort".
To be more precise, you randomize the merge
-routine. You do not merge the two sub-lists systematically, but you do it based on a coin toss (i.e. with probability 0.5 you select the first element of the first sublist, with probability 0.5 you select the first element of the right sublist).
This should run in O(n log n)
and use O(1)
space (if properly implemented).
Below you find a sample implementation in C you might adapt to your needs. Note that this implementation uses randomisation at two places: In splitList
and in merge
. However, you might choose just one of these two places. I'm not sure if the distribution is random (I'm almost sure it is not), but some test cases yielded decent results.
#include <stdio.h>
#include <stdlib.h>
#define N 40
typedef struct _node{
int value;
struct _node *next;
} node;
void splitList(node *x, node **leftList, node **rightList){
int lr=0; // left-right-list-indicator
*leftList = 0;
*rightList = 0;
while (x){
node *xx = x->next;
lr=rand()%2;
if (lr==0){
x->next = *leftList;
*leftList = x;
}
else {
x->next = *rightList;
*rightList = x;
}
x=xx;
lr=(lr+1)%2;
}
}
void merge(node *left, node *right, node **result){
*result = 0;
while (left || right){
if (!left){
node *xx = right;
while (right->next){
right = right->next;
}
right->next = *result;
*result = xx;
return;
}
if (!right){
node *xx = left;
while (left->next){
left = left->next;
}
left->next = *result;
*result = xx;
return;
}
if (rand()%2==0){
node *xx = right->next;
right->next = *result;
*result = right;
right = xx;
}
else {
node *xx = left->next;
left->next = *result;
*result = left;
left = xx;
}
}
}
void mergeRandomize(node **x){
if ((!*x) || !(*x)->next){
return;
}
node *left;
node *right;
splitList(*x, &left, &right);
mergeRandomize(&left);
mergeRandomize(&right);
merge(left, right, &*x);
}
int main(int argc, char *argv[]) {
srand(time(NULL));
printf("Original Linked List\n");
int i;
node *x = (node*)malloc(sizeof(node));;
node *root=x;
x->value=0;
for(i=1; i<N; ++i){
node *xx;
xx = (node*)malloc(sizeof(node));
xx->value=i;
xx->next=0;
x->next = xx;
x = xx;
}
x=root;
do {
printf ("%d, ", x->value);
x=x->next;
} while (x);
x = root;
node *left, *right;
mergeRandomize(&x);
if (!x){
printf ("Error.\n");
return -1;
}
printf ("\nNow randomized:\n");
do {
printf ("%d, ", x->value);
x=x->next;
} while (x);
printf ("\n");
return 0;
}
Upvotes: 6
Reputation: 1666
If you know both N and n, I think you can do it simply. It's fully random, too. You only iterate through the whole list once, and through the randomized part each time you add a node. I think that's O(n+NlogN) or O(n+N^2). I'm not sure. It's based upon updating the conditional probability that a node is selected for the random portion given what happened to previous nodes.
I don't know C, but I can give you the pseudocode. In this, I refer to the permutation as the first elements that are randomized.
integer size=0; //size of permutation
integer position=0 //number of nodes you've traversed so far
Node head=head of linked list //this holds the node at the head of your linked list.
Node current_node=head //Starting at head, you'll move this down the list to check each node, whether you put it in the list.
Node previous=head //stores the previous node for changing pointers. starts at head to avoid asking for the next field on a null node
While ((size not equal to N) or (current_node is not null)){ //iterating through the list until the permutation is full. We should never pass the end of list, but just in case, I include that condition)
pperm=(N-size)/(n-position) //probability that a selected node will be in the permutation.
if ([generate a random decimal between 0 and 1] < pperm) //this decides whether or not the current node will go in the permutation
if (j is not equal to 0){ //in case we are at start of list, there's no need to change the list
pfirst=1/(size+1) //probability that, if you select a node to be in the permutation, that it will be first. Since the permutation has
//zero elements at start, adding an element will make it the initial node of a permutation and percent chance=1.
integer place_in_permutation = round down([generate a random decimal between 0 and 1]/pfirst) //place in the permutation. note that the head =0.
previous.next=current_node.next
if(place_in_permutation==0){ //if placing current node first, must change the head
current_node.next=head //set the current Node to point to the previous head
head=current_node //set the variable head to point to the current node
}
else{
Node temp=head
for (counter starts at zero. counter is less than place_in_permutation-1. Each iteration, increment counter){
counter=counter.next
} //at this time, temp should point to the node right before the insertion spot
current_node.next=temp.next
temp.next=current_node
}
current_node=previous
}
size++ //since we add one to the permutation, increase the size of the permutation
}
j++;
previous=current_node
current_node=current_node.next
}
You could probably increase the efficiency if you held on to the most recently added node in case you had to add one to the right of it.
Upvotes: 1
Reputation: 137780
First, get the length of the list and the last element. You say you already do a traversal before randomization, that would be a good time.
Then, turn it into a circular list by linking the first element to the last element. Get four pointers into the list by dividing the size by four and iterating through it for a second pass. (These pointers could also be obtained from the previous pass by incrementing once, twice, and three times per four iterations in the previous traversal.)
For the randomization pass, traverse again and swap pointers 0 and 2 and pointers 1 and 3 with 50% probability. (Do either both swap operations or neither; just one swap will split the list in two.)
Here is some example code. It looks like it could be a little more random, but I suppose a few more passes could do the trick. Anyway, analyzing the algorithm is more difficult than writing it :vP . Apologies for the lack of indentation; I just punched it into ideone in the browser.
#include <iostream>
#include <cstdlib>
#include <ctime>
using namespace std;
struct list_node {
int v;
list_node *n;
list_node( int inv, list_node *inn )
: v( inv ), n( inn) {}
};
int main() {
srand( time(0) );
// initialize the list and 4 pointers at even intervals
list_node *n_first = new list_node( 0, 0 ), *n = n_first;
list_node *p[4];
p[0] = n_first;
for ( int i = 1; i < 20; ++ i ) {
n = new list_node( i, n );
if ( i % (20/4) == 0 ) p[ i / (20/4) ] = n;
}
// intervals must be coprime to list length!
p[2] = p[2]->n;
p[3] = p[3]->n;
// turn it into a circular list
n_first->n = n;
// swap the pointers around to reshape the circular list
// one swap cuts a circular list in two, or joins two circular lists
// so perform one cut and one join, effectively reordering elements.
for ( int i = 0; i < 20; ++ i ) {
list_node *p_old[4];
copy( p, p + 4, p_old );
p[0] = p[0]->n;
p[1] = p[1]->n;
p[2] = p[2]->n;
p[3] = p[3]->n;
if ( rand() % 2 ) {
swap( p_old[0]->n, p_old[2]->n );
swap( p_old[1]->n, p_old[3]->n );
}
}
// you might want to turn it back into a NULL-terminated list
// print results
for ( int i = 0; i < 20; ++ i ) {
cout << n->v << ", ";
n = n->n;
}
cout << '\n';
}
Upvotes: 2
Reputation: 3752
Similar to Vlad's answer, here is a slight improvement (statistically):
Indices in algorithm are 1 based.
if r != N
4.1 Traverse the list to item r and its predecessor.
If lastR != -1
If r == lastR, your pointer for the of the r'th item predecessor is still there.
If r < lastR, traverse to it from the beginning of the list.
If r > lastR, traverse to it from the predecessor of the lastR'th item.
4.2 remove the r'th item from the list into a result list as the tail.
4.3 lastR = r
Since you do not have random access, this will reduce the traversing time you will need within the list (I assume that by half, so asymptotically, you won't gain anything).
Upvotes: 0
Reputation: 35584
For the case when N is really big (so it doesn't fit your memory), you can do the following (a sort of Knuth's 3.4.2P):
Beware that this is O(N^2), unless you can ensure random access in the step 3.
In case the N is relatively small, so that N items fit into the memory, just load them into array and shuffle, like @Mitch proposes.
Upvotes: 1
Reputation: 54270
I don't believe there's any efficient way to randomly shuffle singly-linked lists without an intermediate data structure. I'd just read the first N elements into an array, perform a Fisher-Yates shuffle, then reconstruct those first N elements into the singly-linked list.
Upvotes: 4
Reputation: 300539
Convert to an array, use a Fisher-Yates shuffle, and convert back to a list.
Upvotes: 4