Lapsio

Reputation: 7074

Using memcpy() to move tail of buffer to its beginning? (overlap)

I have a read buffer for a binary file that holds structures of variable length. Near the end of the buffer there will always be an incomplete struct. I want to move that tail to the beginning of the buffer and then read buffer_size - tail_len bytes on the next file read. Something like this:

char buf[8192];
size_t cur = 0, rcur = 0;
while (1) {
  read(fd, &buf[rcur], 8192 - rcur); // fd: the open file descriptor
  while (cur + sizeof(mystruct) < 8192) {
    mystruct_ptr = (mystruct *)&buf[cur];
    if (cur + sizeof(mystruct) + mystruct_ptr->tailsize > 8192) break; // incomplete
    // do stuff
    cur += sizeof(mystruct) + mystruct_ptr->tailsize;
  }
  memcpy(buf, &buf[cur], 8192 - cur); // source and destination may overlap
  rcur = 8192 - cur;
  cur = 0;
}

This should be okay if the tail is small and the buffer is big, because then memcpy most likely won't overlap the copied memory segment within a single copy iteration. However, it sounds slightly risky when the tail gets big, say more than 50% of the buffer.

If the buffer is really huge and the tail is also huge, it should still be okay, since there is a physical limit on how much data can be copied in a single operation, which if I remember correctly is 512 bits (64 bytes) for modern x86_64 CPUs using vector units. I thought about adding a condition that checks the length of the tail and, if it is too big compared to the size of the buffer, performs a naive byte-by-byte copy. But the question is:

How big is too big for such an overlapping memcpy to be considered more or less safe? tail > buffer size - 2 KB?

Upvotes: 1

Views: 386

Answers (1)

John Bollinger

Reputation: 180388

Per the standard, memcpy() has undefined behavior if the source and destination regions overlap. It doesn't matter how big the regions are or how much overlap there is. Undefined behavior cannot ever be considered safe.

If you are writing to a particular implementation, and that implementation defines behavior for some such copying, and you don't care about portability, then you can rely on your implementation's specific behavior in this regard. But I recommend not. That would be a nasty bug waiting to bite people who decide to use the code with some other implementation after all. Maybe even future you.

And in this particular case, having the alternative of using memmove(), which is dedicated to this exact purpose, makes gambling with memcpy() utterly reckless.
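For illustration, the tail move from the question could be written roughly like this. It is only a minimal sketch: the helper name move_tail_to_front and its parameters are hypothetical, not something from the question, and it assumes cur never exceeds the buffer size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (name and signature are assumptions for this sketch).
 * Moves the unprocessed bytes buf[cur .. size) to the start of buf and
 * returns how many bytes were kept, i.e. the offset at which the next
 * read should continue filling the buffer. */
static size_t move_tail_to_front(char *buf, size_t size, size_t cur)
{
    assert(cur <= size);            /* precondition assumed by this sketch */

    size_t tail_len = size - cur;

    /* memmove() is specified to behave as if the bytes were first copied
     * to a temporary array, so overlap between source and destination is
     * fine no matter how large the tail is. */
    memmove(buf, buf + cur, tail_len);

    return tail_len;
}
```

In the question's loop this would replace the memcpy() line, along the lines of rcur = move_tail_to_front(buf, sizeof buf, cur); cur = 0;. No size threshold or byte-by-byte fallback is needed.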

Upvotes: 2
