mtraceur

Reputation: 3736

Copy Arbitrary Type in C Without Dynamic Memory Allocation

The Question:

I think I have figured out a way that, near as I can tell, allows you to write completely type-agnostic code that makes a copy of a variable of arbitrary type on the "stack" (in quotes because the C standard does not actually require there to be a stack, so what I really mean is that it's copied into an object with automatic storage duration in local scope). Here it is:

/* Save/duplicate thingToCopy */
char copyPtr[sizeof(thingToCopy)];
memcpy(copyPtr, &thingToCopy, sizeof(thingToCopy));

/* modify the thingToCopy variable to do some work; do NOT do operations directly on the data in copyPtr, that's just a "storage bin". */

/* Restore old value of thingToCopy */
memcpy(&thingToCopy, copyPtr, sizeof(thingToCopy));

From my limited testing it works and near as I can tell it should work on all standards-compliant C implementations, but just in case I missed something, I'd like to know:

*GCC 4.6.1 on my armel v7 test device, with -O3 optimization, produced identical code to regular code using normal assignments to temporary variables, but it could be that my test cases were just simple enough that it was able to figure it out, and that it would get confused if this technique were used more generally.

As a bonus passing interest, I'm curious if this would break in mostly-C-compatible languages (the ones I know of are C++, Objective-C, D, and maybe C#, though mentions of others are welcome too).

Rationale:

This is why I think the above works, in case you find it helpful to know where I'm coming from in order to explain any mistakes I may have made:

The C standard's "byte" (in the traditional sense of "smallest addressable unit of memory", not in the modernized "8 bits" meaning) is the char type - the sizeof operator produces numbers in units of char. So we can get exactly the smallest size of storage (that we can work with in C) needed for an arbitrary variable's type by using the sizeof operator on that variable.

The C standard guarantees that pretty much all pointer types can be converted implicitly into a void * (with a change of representation if their representations differ; incidentally, the C standard guarantees that void * and char * have identical representations).

The "name" of an array of a given type, and a pointer to that same type, can basically be treated identically as far as the syntax is concerned.

The sizeof operator is figured out at compile-time, so we can do char foo[sizeof(bar)] without depending on the effectively non-portable VLAs.
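A minimal sketch of that point, assuming a made-up struct packet type and made-up function names: arrays declared at file scope can never be VLAs, so the declaration below only compiles because sizeof yields a constant here.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical type, used only for illustration. */
struct packet {
    long seq;
    char payload[10];
};

struct packet current_packet;

/* A file-scope array must have a constant size, so this line is accepted
   only because sizeof(current_packet) is an integer constant expression. */
unsigned char packet_backup[sizeof(current_packet)];

/* Size of the backup storage in bytes (that is, in units of char). */
size_t backup_size_in_bytes(void)
{
    return sizeof packet_backup;
}

/* Size of the backup storage in bits; CHAR_BIT is the number of bits in a char. */
size_t backup_size_in_bits(void)
{
    return sizeof packet_backup * CHAR_BIT;
}
```

The byte array ends up exactly as large as the object it mirrors, padding included.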

Therefore, we should be able to declare an array of "chars" that is the minimum size necessary to hold a given type.

Thus we should be able to pass the address of the variable to be copied, and the name of the array, to memcpy (as I understand it, the array name is implicitly used as a char * to the first element of the array). Since any pointer can be implicitly converted to a void * (with a change of representation if necessary), this works.

The memcpy should make a bitwise copy of the variable we are copying to the array. Regardless of what the type is, any padding bits involved, etc, the sizeof guarantees we'll grab all the bits that make up the type, including padding.

Since we can't explicitly use/declare the type of the variable we just copied, and because some architectures might have alignment requirements for various types that this hack might violate some of the time, we can't use this copy directly - we'd have to memcpy it back into the variable we got it from, or one of the same type, in order to make use of it. But once we copy it back, we have an exact copy of what we put there in the first place. Essentially, we are freeing the variable itself to be used as scratch space.
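Put together, the reasoning above could be sketched as the following function. The struct sample type and the function name are invented for the example, and the typed reference copy exists only so the result can be checked; the technique itself never names the type.

```c
#include <string.h>

/* Hypothetical type, used only for illustration. */
struct sample {
    int    id;
    double weight;
};

/* Overwrites *s, then restores it from a byte-array backup.
   Returns 1 if the restored members equal the original ones, else 0. */
int save_and_restore(struct sample *s)
{
    /* Typed reference copy, kept only to verify the result afterwards. */
    const struct sample original = *s;

    /* Save: sizeof(*s) is a constant expression, so this is not a VLA. */
    unsigned char backup[sizeof(*s)];
    memcpy(backup, s, sizeof(*s));

    /* Use the variable itself as scratch space. */
    s->id = -1;
    s->weight = 0.0;

    /* Restore: copy the saved bytes back into an object of the same type. */
    memcpy(s, backup, sizeof(*s));

    return s->id == original.id && s->weight == original.weight;
}
```

Note that the bytes in backup are never read as a struct sample directly; they are only copied back, which sidesteps the alignment concern.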

Motivation (or, "Dear God Why!?!"):

I like to write type-independent code when useful, and yet I also enjoy coding in C, and combining the two largely comes down to writing the generic code in function-like macros (you can then re-claim type-checking by making wrapper function definitions which call the function-like macro). Think of it like really crude templates in C.

As I've done this, I've run into situations where I needed an additional variable of scratch space, but, given the lack of a portable typeof() operator, I cannot declare any temporary variables of a matching type in such "generic macro" snippets of code. This is the closest thing to a truly portable solution that I've found.

Since we can do this trick multiple times (large enough char array that we can fit several copies, or several char arrays big enough to fit one), as long as we can keep our memcpy calls and copy pointer names straight, it's functionally like having an arbitrary number of temporary variables of the copied type, while being able to keep the generic code type-agnostic.
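As one possible illustration of the "generic macro plus typed wrapper" idea, here is a sketch of a swap. GENERIC_SWAP and swap_double are hypothetical names chosen for this example, not something from an existing library.

```c
#include <string.h>

/* Hypothetical macro: swaps two lvalues of the same type without naming
   that type. Each argument is evaluated more than once, so arguments with
   side effects must be avoided. */
#define GENERIC_SWAP(a, b)                                          \
    do {                                                            \
        /* Compile-time check that both operands have equal size:  \
           a negative array size is rejected by the compiler. */    \
        (void)sizeof(char[sizeof(a) == sizeof(b) ? 1 : -1]);        \
        unsigned char swap_tmp_[sizeof(a)];                         \
        memcpy(swap_tmp_, &(a), sizeof(a));                         \
        memcpy(&(a), &(b), sizeof(a));                              \
        memcpy(&(b), swap_tmp_, sizeof(a));                         \
    } while (0)

/* Typed wrapper: callers get ordinary type checking on the arguments,
   while the body stays type-agnostic. */
void swap_double(double *x, double *y)
{
    GENERIC_SWAP(*x, *y);
}
```

The size check only catches mismatched sizes, not mismatched types of equal size, which is one reason to expose the macro through typed wrappers.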

P.S. To slightly deflect the likely-inevitable rain of judgement, I'd like to say that I do recognize that this is seriously convoluted, and I would only reserve this in practice for very well-tested library code where it significantly added usefulness, not something I would regularly deploy.

Upvotes: 2

Views: 1096

Answers (2)

Cyan

Reputation: 13968

Yes, it works. Yes, it is C89 standard. Yes, it is convoluted.

Minor improvement

A table of bytes char[] can start at any position in memory. Depending on the content of your thingToCopy, and depending on CPU, this can result in sub-optimal copy performance.

Should speed matter (since it may not if this operation is rare), you may prefer to align your table, using int, long long or size_t units instead.
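A rough sketch of what that could look like with size_t units. UNITS_FOR, struct record and roundtrip_aligned are names invented for this example, and whether it is actually faster depends on the CPU and the compiler.

```c
#include <stddef.h>
#include <string.h>

/* Number of size_t units needed to hold n bytes, rounded up. */
#define UNITS_FOR(n) (((n) + sizeof(size_t) - 1) / sizeof(size_t))

/* Hypothetical type, used only for illustration. */
struct record {
    char  tag[5];
    short code;
};

/* Saves *r into size_t-aligned storage, clears it, restores it.
   Returns 1 if the restored members equal the original ones, else 0. */
int roundtrip_aligned(struct record *r)
{
    /* Aligned like size_t and at least sizeof(*r) bytes long. */
    size_t backup[UNITS_FOR(sizeof(*r))];
    struct record before = *r;   /* typed copy, kept only for the check */

    memcpy(backup, r, sizeof(*r));
    memset(r, 0, sizeof(*r));    /* scribble over the original */
    memcpy(r, backup, sizeof(*r));

    return memcmp(r->tag, before.tag, sizeof before.tag) == 0
        && r->code == before.code;
}
```

The storage may be slightly larger than the object because of the rounding, but the memcpy calls still copy exactly sizeof(*r) bytes.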

Major limitation

Your proposition only works if you know the size of thingToCopy. This is a major issue: that means your compiler needs to know what thingToCopy is at compilation time (hence, it cannot be an incomplete type).

Hence, the following sentence is troubling:

Since we can't explicitly use/declare the type of the variable we just copied

No way. In order to compile char copyPtr[sizeof(thingToCopy)];, the compiler must know what thingToCopy is, hence it must have access to its type!

If you know it, you can simply do:

thingToCopy_t save;
save = thingToCopy;
/* do some stuff with thingToCopy */
thingToCopy = save;

which is clearer to read, and even better from an alignment perspective.

Upvotes: 3

logist

Reputation: 23

It would be bad to use your code on an object containing a pointer (except const pointer to const). Someone might modify the pointed-to data, or the pointer itself (e.g. realloc). This would leave your copy of the object in an unexpected or even invalid state.
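A small sketch of that hazard, using a hypothetical struct label and function name: the byte copy saves the pointer value but not the characters it points to, so a change made through the pointer survives the restore.

```c
#include <string.h>

/* Hypothetical type, used only for illustration. */
struct label {
    char *text;    /* points at storage owned elsewhere */
    int   length;
};

/* Runs a save / modify / restore cycle on *lbl and returns the first
   character reachable through lbl->text afterwards. */
char first_char_after_restore(struct label *lbl)
{
    unsigned char backup[sizeof(*lbl)];
    memcpy(backup, lbl, sizeof(*lbl));   /* saves the pointer, not the text */

    lbl->text[0] = 'X';                  /* modifies the pointed-to data */
    lbl->length  = 0;                    /* modifies the object itself */

    memcpy(lbl, backup, sizeof(*lbl));   /* restores length and the pointer only */

    return lbl->text[0];                 /* still the modified character */
}
```

Here the restored object is at least still valid; had the work in the middle called realloc or free on text, the restored pointer could be dangling.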

Generic programming is one of the main driving forces behind C++. Others have tried to do generic programming in C using macros and casts. It's OK for small examples, but doesn't scale well. The compiler can't catch bugs for you when you use those techniques.

Upvotes: 1
