anon
anon

Reputation:

Fastest way to remove huge number of elements from an array in C

I have dynamic array that contains thousands of elements or even more, in order not to consume a large size of memory, I can remove unwanted elements from it (i.e elements have been used and no need for them any more) so from the beginning I can allocate a smaller memory size by estimating the maximum required size after removing the elements each time.

I use this way but it takes a very very long time to finish, sometime takes 30 minutes!

int x, y ;
for (x = 0 ; x<number_of_elements_to_remove ; x++){
    for (y = 0 ; y<size_of_array; y++ ){
            array[y] = array[y+1];
    }
}

Is there a faster way than this?

Upvotes: 7

Views: 1234

Answers (4)

KRoy
KRoy

Reputation: 1406

Here is the code to defragment the array.

int sparse_to_compact(int*arr, int total, int*is_valid) {
        int i = 0;
        int last = total - 1;
        // trim the last invalid elements
        for(; last >= 0 && !is_valid[last]; last--); // trim invalid elements from last

        // now we keep swapping the invalid with last valid element
        for(i=0; i < last; i++) {
                if(is_valid[i])
                        continue;
                arr[i] = arr[last]; // swap invalid with the last valid
                last--;
                for(; last >= 0 && !is_valid[last]; last--); // trim invalid elements
        }
        return last+1; // return the compact length of the array
}

I copied the code from this answer.

I think more efficient way is to use a link-list of buckets. And the buckets are managed by bit-string memory manager. It is like the following,

struct elem {
     uint32_t index; /* helper to locate it's position in the array */
     int x; /* The content/object kept in the array */
}

Suppose, our array content is int and it is encapsulated in a structure named struct elem.

enum {
     MAX_BUCKET_SIZE = 1024,
     MAX_BITMASK_SIZE = (MAX_BUCKET_SIZE + 63) >> 6,
};

struct bucket {
    struct bucket*next; /* link to the next bucket */
    uint64_t usage[MAX_BITMASK_SIZE]; /* track memory usage */
    struct elem[MAX_BUCKET_SIZE]; /* the array */
};

A bucket is defined as an array of struct elem and usage mask.

struct bucket_list {
    struct bucket*head; /* dynamically allocated bucket */
}container;

And a bucket list is a linked list containing all the buckets.

So we need to write memory manager code.

At first we need new bucket to be allocated when needed.

struct bucket*bk = get_empty_bucket(&container);
if(!bk) { /* no empty bucket */
    /* allocate a bucket */
    struct bucket*bk = (struct bucket*)malloc(sizeof(struct bucket));
    assert(bk);
    /* cleanup the usage flag */
    memset(bk->usage, 0, sizeof(bk->usage));
    /* link the bucket */
    bk->next = container.head;
    container.head = bk; 
}

Now as we have the bucket we need to set the value in the array when needed.

for(i = 0; i < MAX_BITMASK_SIZE; i++) {
    uint64_t bits = ~bk.usage[i];
    if(!bits) continue; /* no space */
    /* get the next empty position */
    int bit_index = _builtin_ctzl(bits);
    int index = (i<<6)+bit_index;
    /* set the array value */
    bk->elem[index].index = index;
    bk->elem[index].x = 34/* my value */;
    bk.usage[i] |= 1<<bit_index; /* mark/flag the array element as used */
}

Deleting the array elements is easy as to mark them unused. Now when all the elements in a bucket is unused we can delete the bucket from the link-list.

We can sometimes defragment buckets or optimize them to fit in smaller space. Otherwise when we assign new elements we can select more crowded buckets over less crowded one. When we delete we can swap the element of less crowded one into more crowded one.

It is possible to delete elements of array in efficient way,

int remove_element(int*from, int total, int index) {
        if(index != (total-1))
                from[index] = from[total-1];
        return total; // **DO NOT DECREASE** the total here
}

It is done by swapping the element with the last value.

Upvotes: 0

Rahaf Jawish
Rahaf Jawish

Reputation: 46

as I noticed you want to only delete elements from the start of the array, try this:

  int x;
        for(x = 0 ; x< size_of_array - number_of_elements_to_remove; x++)
           array[x] = array[number_of_elements_to_remove + x];

this way you're using one for loop which reduces the complexity alot

Upvotes: 1

Hagen von Eitzen
Hagen von Eitzen

Reputation: 2177

It seems you essentially do

int y;
for (y = 0; y<size_of_array; y++){
   array[y] = array[y+numbre_of_elements_to_remove];
}

The above should be faster, but there are still some caveats / problems with your code (e.g., access beyond the end od the array).

Upvotes: 0

Sergey Kalinichenko
Sergey Kalinichenko

Reputation: 726479

Instead of removing elements one at a time, with two loops making for an O(n2) solution, you can make a single loop, with a single read and a single write index. Go through the array, copying items as you go:

int rd = 0, wr = 0;
while (rd != size_of_array) {
    if (keep_element(array[rd])) {
        array[wr++] = array[rd];
    }
    rd++;
}

At the end of the loop wr is the number of elements kept in the array.

Upvotes: 3

Related Questions