JohnDoe

Reputation: 915

Fastest way to copy x elements of an array starting from a specific index

I'm a beginner with C and am playing with pointers and arrays.

Let's say I have the following function

GetArrayValues(uint8 **pValues, uint32 *pNumValues);

where:

*pValues points to the first element of the array

*pNumValues is the number of values in the array

Let's suppose the array contains 100 uint8 elements.

Now each time I call the function GetArrayValues() I would like to copy all 30 elements starting from index 20 into a new array:

uint8 myInternalArray[30];

What is the fastest way to do this?

Upvotes: 1

Views: 430

Answers (2)

0___________

Reputation: 67546

You want something like this:

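#include <string.h>

void *arrcpy(const void *src, void *dest, size_t pos, size_t len, size_t elemsize)
{
    const unsigned char *csrc = src;

    /* Copy len elements starting at element index pos; memcpy returns dest. */
    return memcpy(dest, csrc + pos * elemsize, len * elemsize);
}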
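Applied to the numbers in the question, a call might look like the sketch below. It repeats the answer's helper with a return statement added so that the `void *` return type is honoured, uses the standard `uint8_t` in place of the question's `uint8`, and `copy_window` is a hypothetical wrapper name chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The answer's helper, returning dest (which is what memcpy returns). */
static void *arrcpy(const void *src, void *dest, size_t pos, size_t len, size_t elemsize)
{
    const unsigned char *csrc = src;

    return memcpy(dest, csrc + pos * elemsize, len * elemsize);
}

/* Hypothetical wrapper: copy 30 elements starting at index 20.
   Assumes the source array holds at least 50 elements. */
static void copy_window(const uint8_t *values, uint8_t out[30])
{
    arrcpy(values, out, 20, 30, sizeof values[0]);
}
```

Note that the helper takes the source first and the destination second, which is the opposite of memcpy()'s argument order.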

Upvotes: 1

Brendan

Reputation: 37232

In general, the fastest way to do something is to find a way to avoid doing it at all. For example, if neither the array nor its copy is going to be modified, you can just use different pointers to the same data without copying. Similarly, if the data is rarely modified, you might be able to skip the copy initially and make it only when something actually is modified.
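As a rough sketch of the "different pointers to the same data" idea, the code below hands out a pointer plus a length instead of copying. `ByteView` and `make_view` are hypothetical names, and the approach only holds while the underlying array stays alive and unmodified.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view type: a pointer into existing data plus a length. */
typedef struct {
    const uint8_t *data;
    size_t len;
} ByteView;

/* Returns a view of len elements starting at index pos; nothing is copied.
   The caller must make sure pos + len does not exceed the array size. */
static ByteView make_view(const uint8_t *values, size_t pos, size_t len)
{
    ByteView v = { values + pos, len };
    return v;
}
```

For the question's case, `make_view(*pValues, 20, 30)` would give read-only access to elements 20 to 49 without touching them.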

If you can't do that, the second fastest way is to minimise the work you can't avoid. For example, copying data involves reading the original data from one place and writing it to another; but maybe you can create the copy at the same time you create the original, which avoids reading the original back later.
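One way to picture "create the copy when you create the original" is a producer that writes each value to both places in a single pass. This is only a sketch: `produce`, `WINDOW_POS`, and `WINDOW_LEN` are hypothetical names, and the `i * 3` expression is a stand-in for wherever the real data comes from.

```c
#include <stddef.h>
#include <stdint.h>

enum { WINDOW_POS = 20, WINDOW_LEN = 30 };

/* Hypothetical producer: fills values[0..count) and, in the same pass,
   stores elements 20..49 into window so they never need to be read back. */
static void produce(uint8_t *values, size_t count, uint8_t window[WINDOW_LEN])
{
    for (size_t i = 0; i < count; i++) {
        uint8_t v = (uint8_t)(i * 3u);   /* stand-in for the real data source */

        values[i] = v;
        if (i >= WINDOW_POS && i < WINDOW_POS + WINDOW_LEN)
            window[i - WINDOW_POS] = v;
    }
}
```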

If you can't do that, the third fastest way is to use inline assembly or intrinsics. For example, on a modern 80x86 CPU (with AVX2), if you copy 2 extra "unused" bytes then you can probably do it with 2 instructions (load 32 bytes into a register, then store the register elsewhere). In an extreme case, maybe you can keep the copy in a register and do it with a single load.
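A minimal sketch of the load-then-store idea with intrinsics follows. It assumes the destination has room for 32 bytes (30 plus the 2 "unused" ones) and that 32 bytes are readable from the source position. `copy32` is a hypothetical name. The intrinsic path is only compiled when the compiler targets AVX (for example with `-mavx2` on GCC or Clang); otherwise the sketch falls back to a plain memcpy().

```c
#include <stdint.h>
#include <string.h>
#ifdef __AVX__
#include <immintrin.h>
#endif

/* Copies exactly 32 bytes from src to dest.
   Both buffers must have 32 bytes available; no alignment is required. */
static void copy32(uint8_t *dest, const uint8_t *src)
{
#ifdef __AVX__
    __m256i v = _mm256_loadu_si256((const __m256i *)src); /* one 32-byte load  */
    _mm256_storeu_si256((__m256i *)dest, v);              /* one 32-byte store */
#else
    memcpy(dest, src, 32);                                /* portable fallback */
#endif
}
```

With a 100-element source, `copy32(myInternalArray, *pValues + 20)` reads elements 20 to 51, so `myInternalArray` would need to be declared with 32 elements rather than 30.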

If you can't do that, the fourth fastest way is probably using memcpy(). Note that the compiler might be able to optimize the code so that memcpy() becomes as good as it could have been with inline assembly or intrinsics; however, in this case you'd have to make sure the compiler knows that it can modify the 2 extra "unused" bytes, otherwise it'll have to use a slower approach (to avoid corrupting something else that might be after the end of the 30-byte copy).
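The two memcpy() variants described here could be sketched as below. `copy_exact` and `copy_padded` are hypothetical names, `uint8_t` stands in for the question's `uint8`, and whether the padded version actually compiles down to a single 32-byte load and store depends on the compiler and its flags.

```c
#include <stdint.h>
#include <string.h>

/* Exact copy: 30 bytes starting at index 20.
   The compiler must not write past dest[29]. */
static void copy_exact(uint8_t dest[30], const uint8_t *values)
{
    memcpy(dest, values + 20, 30);
}

/* Padded copy: dest has 32 bytes and the source has at least 52 readable
   bytes, so a constant 32-byte copy is allowed to overwrite the 2 extra bytes. */
static void copy_padded(uint8_t dest[32], const uint8_t *values)
{
    memcpy(dest, values + 20, 32);
}
```

In both cases the size is a compile-time constant, which gives the compiler the best chance of replacing the library call with inline loads and stores.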

Upvotes: 0
