Reputation: 3436
I have a performance-related question. Let's say I have a struct like this one:
typedef struct
{
uint8_t FirstSofRec :1; //SOF byte
uint8_t SecSofRec :1; //SOF byte
uint8_t RecPending :1; //Pending flag
uint8_t Timeout :1; //Timeout flag
uint8_t RecCompleted :1; //Receiving completed flag
uint8_t CrcMatch :1; //CRC match flag
uint8_t DataLength :2; //Data length field (1 - 8)
}Reciever_flags_t;
typedef struct
{
Reciever_flags_t flags;
uint8_t SofFrame[2];
uint8_t MsgBuffer[MAX_REC_BUFF_SIZE];
uint8_t CRC;
}Reciever_struct_t;
What is the fastest way (in terms of run-time performance, for embedded code) to copy the contents of one structure to another?
I have the following options:
Direct pointer use:
Reciever_struct_t BASE;
Reciever_struct_t COPY;
Reciever_struct_t *PtToBase = &BASE;
Reciever_struct_t *PtToCopy = &COPY;
*PtToCopy = *PtToBase;
Or using, say, a uint8_t pointer and copying byte by byte (assuming that there is no padding in the structure and we know its size):
Reciever_struct_t BASE;
Reciever_struct_t COPY;
uint8_t *CpyPtrBase = (uint8_t *)&BASE;
uint8_t *CpyPtrCopy = (uint8_t *)&COPY;
size_t SizeIsNotZero = sizeof(BASE); //number of bytes left to copy
while(SizeIsNotZero--)
{
*CpyPtrCopy++ = *CpyPtrBase++;
}
The main point of this question is not details like malloc etc., just the general idea. Thanks in advance, best regards!
Upvotes: 3
Views: 192
Reputation: 93476
The simple structure assignment:
COPY = BASE ;
or
*PtToCopy = *PtToBase ;
is implemented by compiler-generated code and will therefore be optimised for the target and for the compiler options you set.
A high-level coded byte-by-byte copy may be as fast, but is unlikely to be faster. On all but an 8-bit architecture it is likely to be slower.
A better method than the byte copy is:
memcpy( PtToCopy, PtToBase, sizeof(*PtToCopy) ) ;
or just:
memcpy( &COPY, &BASE, sizeof(COPY) ) ;
but that relies on the implementation of the library function memcpy(), which may or may not be the same as the code the compiler generates for an assignment. It is also likely to be optimised for the target, but because it is pre-compiled it won't take your compiler settings into account.
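To make the comparison concrete, here is a minimal sketch of the three variants discussed so far, written as separate functions so that each can be inspected in the generated assembly. The value of `MAX_REC_BUFF_SIZE` and the function names `copy_by_assignment`, `copy_by_memcpy` and `copy_by_bytes` are assumptions made for illustration; they are not from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_REC_BUFF_SIZE 8u /* assumed value; the question does not give it */

typedef struct
{
    uint8_t FirstSofRec  :1; /* SOF byte */
    uint8_t SecSofRec    :1; /* SOF byte */
    uint8_t RecPending   :1; /* Pending flag */
    uint8_t Timeout      :1; /* Timeout flag */
    uint8_t RecCompleted :1; /* Receiving completed flag */
    uint8_t CrcMatch     :1; /* CRC match flag */
    uint8_t DataLength   :2; /* Data length field */
} Reciever_flags_t;

typedef struct
{
    Reciever_flags_t flags;
    uint8_t SofFrame[2];
    uint8_t MsgBuffer[MAX_REC_BUFF_SIZE];
    uint8_t CRC;
} Reciever_struct_t;

/* 1. Structure assignment: the compiler picks the copy strategy. */
void copy_by_assignment(Reciever_struct_t *dst, const Reciever_struct_t *src)
{
    assert(dst != NULL && src != NULL);
    *dst = *src;
}

/* 2. Library call: depends on the memcpy() provided by the toolchain. */
void copy_by_memcpy(Reciever_struct_t *dst, const Reciever_struct_t *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof *dst);
}

/* 3. Hand-written loop: copies one byte per iteration as written. */
void copy_by_bytes(Reciever_struct_t *dst, const Reciever_struct_t *src)
{
    assert(dst != NULL && src != NULL);
    const uint8_t *from = (const uint8_t *)src;
    uint8_t *to = (uint8_t *)dst;
    size_t remaining = sizeof *dst;

    while (remaining--)
    {
        *to++ = *from++;
    }
}
```

All three leave the destination with the same contents; only the generated code differs, and that depends on your compiler and options.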
If you really need to know, benchmark it on your target, or inspect the assembler code generated by your compiler. I suspect, though, that this is a "micro-optimisation" and that you are likely to get better performance gains by considering your code design at a higher, more holistic or abstract level. Larger performance gains tend to come from devising efficient data structures and ways to avoid copying data altogether.
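As a rough illustration of the benchmarking advice, the sketch below times an arbitrary copy routine with the standard `clock()` function. The helper `time_copy`, the `copy_fn` typedef, the two sample copy routines and the value of `MAX_REC_BUFF_SIZE` are all hypothetical. On a real embedded target you would more likely read a hardware timer or cycle counter, and the function-pointer call used here prevents inlining, so treat the numbers as indicative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define MAX_REC_BUFF_SIZE 8u /* assumed value; the question does not give it */

typedef struct
{
    uint8_t FirstSofRec :1, SecSofRec :1, RecPending :1, Timeout :1;
    uint8_t RecCompleted :1, CrcMatch :1, DataLength :2;
} Reciever_flags_t;

typedef struct
{
    Reciever_flags_t flags;
    uint8_t SofFrame[2];
    uint8_t MsgBuffer[MAX_REC_BUFF_SIZE];
    uint8_t CRC;
} Reciever_struct_t;

typedef void (*copy_fn)(Reciever_struct_t *, const Reciever_struct_t *);

void copy_by_assignment(Reciever_struct_t *dst, const Reciever_struct_t *src)
{
    *dst = *src;
}

void copy_by_memcpy(Reciever_struct_t *dst, const Reciever_struct_t *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* Returns the CPU time in seconds spent on `iterations` calls of `copy`. */
double time_copy(copy_fn copy, unsigned long iterations)
{
    assert(copy != NULL);

    Reciever_struct_t base;
    Reciever_struct_t target;
    volatile uint8_t sink = 0; /* keeps the compiler from discarding the copies */

    memset(&base, 0, sizeof base);
    memset(&target, 0, sizeof target);

    clock_t start = clock();
    for (unsigned long i = 0; i < iterations; ++i)
    {
        base.CRC = (uint8_t)i; /* change the source on every pass */
        copy(&target, &base);
        sink = target.CRC;     /* consume the result */
    }
    clock_t end = clock();

    (void)sink;
    return (double)(end - start) / (double)CLOCKS_PER_SEC;
}
```

A call such as `time_copy(copy_by_memcpy, 1000000UL)` gives a figure you can compare against `time_copy(copy_by_assignment, 1000000UL)` built with the same compiler options.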
Upvotes: 8
Reputation: 213892
The former is likely more efficient, as the compiler can then copy with the largest data type possible for the specific CPU. The structs will have struct padding on platforms where alignment matters, so the former method can take advantage of that.
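A small sketch of the padding point: the hypothetical `Mixed_example_t` below (not from the question) mixes member sizes, so on platforms where alignment matters the compiler usually inserts padding and `sizeof` exceeds the sum of the members. A struct made only of `uint8_t` members, like the one in the question, typically has none. The exact numbers are implementation-defined, so the helpers compute them rather than assume them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct with mixed member sizes, for illustration only. */
typedef struct
{
    uint8_t  tag;
    uint32_t value; /* usually aligned to 4 bytes, so padding precedes it */
    uint8_t  crc;   /* usually followed by trailing padding */
} Mixed_example_t;

/* Sum of the member sizes, ignoring any padding. */
size_t mixed_member_bytes(void)
{
    return sizeof(uint8_t) + sizeof(uint32_t) + sizeof(uint8_t);
}

/* Padding bytes the compiler added; the value depends on the platform. */
size_t mixed_padding_bytes(void)
{
    assert(sizeof(Mixed_example_t) >= mixed_member_bytes());
    return sizeof(Mixed_example_t) - mixed_member_bytes();
}
```

A structure assignment copies the whole object in whatever chunks the compiler finds best, while a byte loop as written moves `sizeof(Mixed_example_t)` bytes one at a time, padding included.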
The latter may or may not be as efficient, depending on how good the compiler is at optimizing.
Though if you are concerned about performance, the wisest choice is probably to use memcpy(), because it will be heavily optimized for the particular system.
The only way to tell for sure is to benchmark.
Upvotes: 2
Reputation: 5721
The former way will be faster, as the compiler will have enough information to make it as fast as possible (not the case with the explicit byte loop). Alternatively, you can use memcpy, but I doubt it will be faster.
Upvotes: 0