RaphaelH

Reputation: 2184

Relative pointers in memory mapped file using C

Is it possible to use a structure that contains a pointer to another structure inside a memory-mapped file, instead of storing the offset in some integral type and calculating the pointer from it?

For example, given the following struct:

typedef struct _myStruct_t {
  int number;
  struct _myStruct_t *next;
} myStruct_t;
myStruct_t* first = (myStruct_t*)mapViewHandle;
myStruct_t* next = first->next;

instead of this:

typedef struct _myStruct_t {
  int number;
  int next;
} myStruct_t;
myStruct_t* first = (myStruct_t*)mappedFileHandle;
myStruct_t* next = (myStruct_t*)(mappedFileHandle+first->next);

I have read about the '__based' keyword, but it is Microsoft-specific and therefore Windows-bound.

I am looking for something that works with the GCC compiler.

Upvotes: 6

Views: 4431

Answers (3)

Codemeister

Reputation: 117

C++: It is very doable and portable (the code, though maybe not the data). It was a while ago, but I created a class template for self-relative pointers. I had tree structures inside blocks of memory that might move. Internally, the class held a single intptr_t, but the =, * and -> operators were overloaded so that it behaved like a regular pointer. Handling null took some attention. I also made versions based on int, short and (not very usefully) char: space-saving pointers that cannot point far away (for example, outside the memory block).

In C, you could use macros to wrap get and set:

// typedef struct OBJ { int p; } OBJ;
#define OBJPTR(P) ((OBJ*)((P) ? (char*)&(P) + (P) : 0))
#define SETOBJPTR(P,V) ((P) = (V) ? (char*)(V) - (char*)&(P) : 0)

The C macros above implement self-relative pointers, which can be slightly more efficient than based pointers. Here is a working example of a tree in a small block of relocatable memory, using 2-byte (short) pointers to save space. Casting pointers to int is only safe in 32-bit code, so a 64-bit build should do the arithmetic on char * (or intptr_t) instead:

#include <stdio.h>
#include <stdlib.h> // malloc, free

typedef struct OBJ
{
  int val;
  short left;
  short right;
// 0 means null; any other value is a byte offset from the field itself
#define OBJPTR(P) ((OBJ*)((P) ? (char*)&(P) + (P) : 0))
#define SETOBJPTR(P,V) ((P) = (V) ? (char*)(V) - (char*)&(P) : 0)
} OBJ;

typedef struct HEAD
{
  short top; // top of tree
  short available; // index of next available place in data block
  char data[0x7FFF]; // put whole tree here
} HEAD;

HEAD * blk;

OBJ * Add(int val)
{
  short * where = &blk->top; // find pointer to "pointer" to place new node
  OBJ * nd;
  while ( ( nd = OBJPTR(*where) ) != 0 )
    where = val < nd->val ? &nd->left : &nd->right;
  nd = (OBJ*) ( blk->data + blk->available ); // allocate node
  blk->available += sizeof(OBJ); // finish allocation
  nd->val = val;
  nd->left = nd->right = 0;
  SETOBJPTR( *where, nd );
  return nd;
}

void Dump(OBJ*top,int indent)
{
  if ( ! top ) return;
  Dump( OBJPTR(top->left), indent + 3 );
  printf( "%*s %d\n", indent, "", top->val );
  Dump( OBJPTR(top->right), indent + 3 );
}

int main(void)
{
  blk = (HEAD*) malloc(sizeof(HEAD));
  if ( ! blk ) return 1;
  blk->available = 0; // the data block starts out empty
  blk->top = 0;
  Add(23); Add(2); Add(45); Add(99); Add(0); Add(12);
  Dump( OBJPTR(blk->top), 3 );
  { // PROOF a copy at a different address still has the tree:
  HEAD blk2 = *blk;
  Dump( OBJPTR(blk2.top), 3 );
  }
  free(blk);
  return 0;
}

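A note about dereferencing (the "*" operator) with based versus self-relative pointers: a based pointer can involve 2 addresses and 2 memory fetches, while a self-relative pointer involves 1 address and 1 memory fetch. In pseudo-assembly, the first sequence below is the self-relative case and the second is the based case: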

load reg1,address of pointer
load reg2,fetch reg1
add reg3,reg2+reg1

load reg1,address of pointer
load reg2,fetch reg1
load reg3,address of base
load reg4,fetch base
add reg5,reg2+reg4

Upvotes: 1

Anya Shenanigans

Reputation: 94769

I'm pretty sure there's nothing in GCC akin to the __based pointer from Visual Studio. The only time I've seen anything like that built in was on some pretty odd hardware. The Visual Studio extension provides an address translation layer around all operations involving the pointer.

So it sounds like you're into roll-your-own territory; although I'm willing to be told otherwise.

The last time I dealt with something like this was on the Palm platform, where memory could be moved around unless you locked it down. Allocations gave you memory handles; you had to call MemHandleLock before using one and MemPtrUnlock after you were finished, so that the OS could move the block around (which seemed to happen on ARM-based Palm devices).

If you insist on storing pointer-like values in a memory-mapped structure, my first recommendation would be to store the value in an intptr_t, an integer type wide enough to hold a pointer value. While your offsets are unlikely to exceed 4GB, it pays to stay safe.

That said, this is probably easy to implement in C++ using a class template; it's just that tagging the question as C makes things a lot messier.
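Staying in C, the roll-your-own route could look roughly like the sketch below. It is only an illustration under my own assumptions: the node layout, the helper names (node_at, offset_in, sum_list, sum_after_relocation) and the convention that offset 0 means "no next node" are all made up here, and two heap blocks stand in for two separate mappings of the same file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: 'next' is a byte offset from the start of the
   mapping; offset 0 is reserved to mean "no next node". */
typedef struct node {
    int32_t  number;
    uint32_t next;
} node;

/* Stored offset -> pointer that is valid for the current mapping. */
node *node_at(void *base, uint32_t off)
{
    return off ? (node *)((char *)base + off) : NULL;
}

/* Pointer inside the mapping -> offset that can be stored in the file. */
uint32_t offset_in(const void *base, const node *n)
{
    if (n == NULL)
        return 0;
    ptrdiff_t d = (const char *)n - (const char *)base;
    assert(d > 0 && (uintmax_t)d <= UINT32_MAX);
    return (uint32_t)d;
}

/* Add up 'number' over the list whose first node is at offset 'first'. */
long sum_list(void *base, uint32_t first)
{
    long total = 0;
    for (node *n = node_at(base, first); n != NULL; n = node_at(base, n->next))
        total += n->number;
    return total;
}

/* Build a two-node list in one block, copy the block to a different
   address (standing in for a second mapping), and walk the copy. */
long sum_after_relocation(void)
{
    enum { BLOCK_SIZE = 64, FIRST = 8, SECOND = 24 };
    char *a = calloc(1, BLOCK_SIZE);
    char *b = malloc(BLOCK_SIZE);
    if (a == NULL || b == NULL) {
        free(a);
        free(b);
        return -1;
    }
    node *n1 = node_at(a, FIRST);
    node *n2 = node_at(a, SECOND);
    n1->number = 10;
    n1->next = offset_in(a, n2);
    n2->number = 32;
    n2->next = 0;
    memcpy(b, a, BLOCK_SIZE);
    free(a);                          /* the original block is gone */
    long total = sum_list(b, FIRST);  /* the offsets still resolve in the copy */
    free(b);
    return total;
}
```

Calling sum_after_relocation() returns 42 here, because nothing stored in the block depends on the address the block happens to live at. With a real mapping, base would be the pointer returned by mmap.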

Upvotes: 2

Norman Gray

Reputation: 12514

The first is extremely unlikely to work.

Remember that a pointer such as struct _myStruct_t * holds a location in memory. Suppose that this structure is located at address 1000: the next structure, located just after it, might then be at address 1008, and that is what is stored in ->next (the numbers don't matter; what matters is that they are memory addresses). Now you save that structure to a file (or unmap it) and map it again, but this time it ends up starting at address 2000, while the ->next pointer is still 1008 and so no longer points at the second structure.

You have (generally) no control over where files are mapped in memory, so no control over the actual memory locations of the elements within the mapped structure. Therefore you can only depend on relative offsets.
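To make the failure concrete, here is a small sketch that imitates "map, write, map again somewhere else" with two heap blocks and a memcpy. The names abs_node, points_into and next_is_stale_after_copy are invented for this illustration, and a real mapping would of course come from mmap rather than malloc.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Same shape as the first struct in the question: an absolute pointer. */
typedef struct abs_node {
    int number;
    struct abs_node *next;
} abs_node;

/* Returns 1 if p lies inside the block [base, base + size). */
int points_into(const void *base, size_t size, const void *p)
{
    uintptr_t lo = (uintptr_t)base;
    uintptr_t at = (uintptr_t)p;
    return at >= lo && at - lo < size;
}

/* Returns 1 if, after copying the block, the copy's 'next' still points
   into the OLD block instead of into the copy. */
int next_is_stale_after_copy(void)
{
    size_t size = 2 * sizeof(abs_node);
    abs_node *old_view = malloc(size);
    abs_node *new_view = malloc(size);
    if (old_view == NULL || new_view == NULL) {
        free(old_view);
        free(new_view);
        return -1;
    }
    old_view[0].number = 1;
    old_view[0].next = &old_view[1];   /* absolute address in old_view */
    old_view[1].number = 2;
    old_view[1].next = NULL;

    memcpy(new_view, old_view, size);  /* the "second mapping" */

    int stale = points_into(old_view, size, new_view[0].next)
             && !points_into(new_view, size, new_view[0].next);
    free(old_view);
    free(new_view);
    return stale;
}
```

next_is_stale_after_copy() returns 1: the bytes were copied faithfully, but the address stored in them still refers to the old location.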

Note that your second version may or may not work as you expect, depending on the declared type of mappedFileHandle. If it's a pointer to myStruct_t, then adding an integer n to it will produce a pointer to an address which is n*sizeof(myStruct_t) bytes higher in memory (as opposed to being n bytes higher).
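The scaling is easy to check for yourself. The helpers below are hypothetical (the question does not show how mappedFileHandle is declared); they just measure how many bytes each style of addition moves the pointer.

```c
#include <stddef.h>

/* The offset-based struct from the question. */
typedef struct myStruct_t {
    int number;
    int next;
} myStruct_t;

/* Bytes actually advanced by "typed_pointer + n". */
size_t bytes_advanced_typed(myStruct_t *base, int n)
{
    return (size_t)((char *)(base + n) - (char *)base);
}

/* Bytes advanced when the addition is done on a char * instead. */
size_t bytes_advanced_bytewise(myStruct_t *base, int n)
{
    return (size_t)(((char *)base + n) - (char *)base);
}

/* Treat n as a byte offset from the start of the mapping. */
myStruct_t *at_byte_offset(void *base, int n)
{
    return (myStruct_t *)((char *)base + n);
}
```

For an array of at least two elements, bytes_advanced_typed(arr, 2) gives 2 * sizeof(myStruct_t), while bytes_advanced_bytewise(arr, 2) gives 2. If next is meant as a byte offset, something like at_byte_offset is the form to use.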

If you declared mappedFileHandle as

myStruct_t* mappedFileHandle;

then you can subscript it like an array. If the mapped file is laid out as a sequence of myStruct_t blocks, and the next field refers to other blocks by index within that sequence, then (supposing myStruct_t* b is a block of interest)

mappedFileHandle[b->next].number

is the number field of the (b->next)th block in the sequence.

(This is just a consequence of the way that arrays are defined in C: mappedFileHandle[b->next] is defined to be equivalent to *(mappedFileHandle + b->next), which is an object of type myStruct_t, whose number field you can therefore access.)
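As a sketch of that index-based layout, the function below walks a chain of blocks by subscripting. The END_OF_LIST marker (-1) and the name sum_from are assumptions of mine; the question does not say how the end of the list is encoded.

```c
/* The offset-based struct from the question, with 'next' read as an
   index into the array of blocks rather than as a byte offset. */
typedef struct myStruct_t {
    int number;
    int next;
} myStruct_t;

/* Assumed end marker for this sketch. */
enum { END_OF_LIST = -1 };

/* Add up 'number' along the chain that starts at index 'start'. */
long sum_from(const myStruct_t *mappedFileHandle, int start)
{
    long total = 0;
    for (int i = start; i != END_OF_LIST; i = mappedFileHandle[i].next)
        total += mappedFileHandle[i].number;
    return total;
}
```

With blocks {1, 2}, {5, -1}, {10, 1}, sum_from(blocks, 0) visits indices 0, 2 and 1 and returns 16. Since only indices are stored, the result does not depend on where the file is mapped.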

Upvotes: 0
