Reputation: 4093
I have recently been reading about a family of automatic memory management techniques that rely on storing information in the pointer returned by the allocator, i.e. few bits of header e.g. to differentiate between pointers or to store thread-related information (note that I'm not talking about limited-field reference counting here, only immutable information).
I'd like to toy with these techniques. Now, to implement them, I need to be able to return pointers with a specific shape from my allocator. I suppose I could play with the least weight bits but this would require padding that looks extremely memory consuming, so I believe that I should play with the heaviest bits. However, I have no good idea on how to do this. Is there a way for me to, call malloc
or malloc_create_zone
or some related function and request a pointer that always starts with the given bits?
Thanks everyone!
Upvotes: 1
Views: 181
Reputation: 31053
The amount of information you can actually store in a pointer is pretty limited (typically one or two bits per pointer). And every attempt to dereference the pointer has to first mask out the magic information. The technique is often called tagging, BTW.
#define TAG_MASK 0x3
#define CONS_TAG 0x1
#define STRING_TAG 0x2
#define NUMBER_TAG 0x3
typedef uintptr_t value_t;
typedef struct cons {
value_t car;
value_t cdr;
} cons_t;
value_t
create_cons(value_t t1, value_t t2)
{
cons_t* pair = malloc(sizeof(cons_t));
value_t addr = (value_t)pair;
pair->car = t1;
pair->cdr = t2;
return addr | CONS_TAG;
}
value_t
car_of_cons(value_t v)
{
if ((v % TAG_MASK) != CONS_TAG) error("wrong type of argument");
return ((cons_t*) (v & ~TAG_MASK))->car;
}
One advantage of this technique is, that you can directly infer the type of the object from the pointer itself. You don't need to dereference it (say, in order to read a special type
field or similar). Many language implementations using this scheme also have a special tag combination for "immediate" numbers and other small values, which can be represented direcly using the "pointer".
The disadvatage is, that the amount of information, which can be stored, is pretty limited. Also, as the example code shows, you have to be aware of the tagging in every access to the object, and need to "untag" the pointer before actually using it.
The use of the least significant bits for tagging stemms from the observation, that on most platforms, all pointer to malloc
ed memory is actually aligned on a non-byte boundary (usually 8 bytes), so the least significant bits are always zero.
Upvotes: 2