rahman

Reputation: 4948

How to measure the memory usage of std::unordered_map

We know that hash-table-based container implementations like std::unordered_map use a lot of memory, but I don't know how much "a lot" actually is.

Setting aside space-complexity notation, and ignoring whether a container element is a pointer to a larger object:

Is there any way to figure out how many bytes are being used by such a container at run-time?

Is there a way to tell, at runtime, how much memory any container uses?

Upvotes: 28

Views: 25041

Answers (3)

Jeremy Friesner

Reputation: 73379

If you'd like to observe heap memory being allocated and freed (for unordered_map or anything else), you can always overload the global new and delete operators to keep a running tally and/or print it as you go. (Note: the example code below isn't thread-safe and probably won't work reliably outside of simple toy programs, but it's good enough to experiment with.)

#include <cstdint>
#include <cstdio>    // printf
#include <cstdlib>   // malloc/free
#include <new>
#include <unordered_map>

static inline uint32_t    CONVERT_USER_TO_INTERNAL_SIZE(uint32_t uNumBytes) {return (uNumBytes+sizeof(size_t));}
static inline uint32_t    CONVERT_INTERNAL_TO_USER_SIZE(uint32_t iNumBytes) {return (iNumBytes-sizeof(size_t));}
static inline size_t *  CONVERT_USER_TO_INTERNAL_POINTER(void * uptr)   {return (((size_t*)uptr)-1);}
static inline void   *  CONVERT_INTERNAL_TO_USER_POINTER(size_t * iptr) {return ((void *)(iptr+1));}

static size_t _currentlyAllocatedBytes = 0;  // Running tally of how many bytes our process has allocated

void * instrumentedAlloc(size_t userSize)
{
   const size_t internalSize = CONVERT_USER_TO_INTERNAL_SIZE(userSize);

   void * userPtr = NULL;

   size_t * internalPtr = (size_t *) malloc(internalSize);
   if (internalPtr)
   {
      *internalPtr = internalSize;  // our little header tag so that instrumentedFree() will know how big the allocation was
      _currentlyAllocatedBytes += internalSize;
      printf("instrumentedAlloc(%zu):  total allocation count is now %zu\n", internalSize, _currentlyAllocatedBytes);
      userPtr = CONVERT_INTERNAL_TO_USER_POINTER(internalPtr);
   }
   return userPtr;
}

void instrumentedFree(void * userPtr)
{
   if (userPtr)
   {
      size_t * internalPtr = CONVERT_USER_TO_INTERNAL_POINTER(userPtr);
      const size_t allocationSize = *internalPtr;
      _currentlyAllocatedBytes -= allocationSize;
      printf("instrumentedFree(%p,%zu):  total allocation count is now %zu\n", userPtr, allocationSize, _currentlyAllocatedBytes);
      free(internalPtr);
   }
}

// Note: dynamic exception specifications (throw(std::bad_alloc)) are ill-formed
// as of C++17, so plain declarations are used here.
void * operator new(size_t s)
{
   void * ret = instrumentedAlloc(s);
   if (ret == NULL) {throw std::bad_alloc();}
   return ret;
}

void * operator new[](size_t s)
{
   void * ret = instrumentedAlloc(s);
   if (ret == NULL) {throw std::bad_alloc();}
   return ret;
}

void operator delete(  void * p) noexcept {instrumentedFree(p);}
void operator delete[](void * p) noexcept {instrumentedFree(p);}

int main(int, char **)
{
   printf("ABOUT TO DECLARE test_map\n");
   std::unordered_map<int, int> test_map;

   printf("ABOUT TO POPULATE test_map\n");
   for (int i=0; i<10; i++) test_map[i] = i;

   printf("ABOUT TO CLEAR test_map\n");
   test_map.clear();

   return 0;
}

... when run, the above program prints this output on my machine:

Jeremys-Mac-mini-2:~ jaf$ ./a.out
ABOUT TO DECLARE test_map
ABOUT TO POPULATE test_map
instrumentedAlloc(32):  total allocation count is now 32
instrumentedAlloc(24):  total allocation count is now 56
instrumentedAlloc(32):  total allocation count is now 88
instrumentedAlloc(32):  total allocation count is now 120
instrumentedAlloc(48):  total allocation count is now 168
instrumentedFree(0x600003be1128,24):  total allocation count is now 144
instrumentedAlloc(32):  total allocation count is now 176
instrumentedAlloc(32):  total allocation count is now 208
instrumentedAlloc(32):  total allocation count is now 240
instrumentedAlloc(96):  total allocation count is now 336
instrumentedFree(0x6000035e0248,48):  total allocation count is now 288
instrumentedAlloc(32):  total allocation count is now 320
instrumentedAlloc(32):  total allocation count is now 352
instrumentedAlloc(32):  total allocation count is now 384
instrumentedAlloc(32):  total allocation count is now 416
ABOUT TO CLEAR test_map
instrumentedFree(0x600003be1228,32):  total allocation count is now 384
instrumentedFree(0x600003be1208,32):  total allocation count is now 352
instrumentedFree(0x600003be11e8,32):  total allocation count is now 320
instrumentedFree(0x600003be11c8,32):  total allocation count is now 288
instrumentedFree(0x600003be11a8,32):  total allocation count is now 256
instrumentedFree(0x600003be1188,32):  total allocation count is now 224
instrumentedFree(0x600003be1128,32):  total allocation count is now 192
instrumentedFree(0x600003be1168,32):  total allocation count is now 160
instrumentedFree(0x600003be1148,32):  total allocation count is now 128
instrumentedFree(0x600003be1108,32):  total allocation count is now 96
instrumentedFree(0x600001fe00c8,96):  total allocation count is now 0

Upvotes: 1

Tony Delroy

Reputation: 106244

There's no portable way to tell how many bytes are in use. What you can find out is:

  • size() indicates how many data elements have been inserted into the container
  • bucket_count() indicates how many buckets the underlying hash table has; each bucket can be expected to hold an iterator into the linked list of elements (GCC, at least, maintains a single linked list for all elements in the entire container, so it can always increment an iterator in O(1) time, even when bucket_count() is high and size() is low)

Now:

  • bytes actually used for element storage will be m.size() * sizeof(M::value_type)

  • bytes used for the hash table buckets depend on how the internal lists are stored - std::unordered_map::bucket_size has constant complexity, so we might conclude there's a size and a linked-list iterator per bucket, i.e. m.bucket_count() * (sizeof(size_t) + sizeof(void*)); though it may be that the complexity is only amortised constant, with load_factor() bounded and no per-bucket size stored (I'd prefer to implement it that way myself)

  • if each of the inserted elements is part of the list, they'll need a next pointer, so we can add another m.size() * sizeof(void*)

  • some implementations, like GCC's, store the hash value with every element. This allows a quick-and-dirty inequality check between elements colliding in the same bucket (which - with a decent hash function - is massively more likely to be because the hash values collide after modding into the bucket count than because the hash values are identical), and avoids recalculating hashes when the bucket array is resized. If your implementation does this, it'll likely add m.size() * sizeof(size_t); as it's optional, I won't include it in the overall formula below, but add it in if your implementation does this

  • each memory allocation may be rounded up to a size convenient for the memory allocation library's management - e.g. the next power of two - which means you could be allocating up to 2x the memory currently in use, but on average you're probably allocating around 1.5x as much (i.e. 50% more than you're using). So, let's add 50%, just for the list nodes, as the bucket array is likely a power of two given size_t and a pointer: 0.5 * m.size() * (sizeof(void*) + sizeof(M::value_type))

  • especially in debug mode, there may be any amount of implementation-specific housekeeping and error-detection data

Summarily, a ballpark reasonable figure is:

(m.size() * (sizeof(M::value_type) + sizeof(void*)) + // data list
 m.bucket_count() * (sizeof(void*) + sizeof(size_t))) // bucket index
* 1.5 // estimated allocation overheads

You can explore this further by creating a number of large tables and seeing how top or your OS's task manager reports the process's memory usage.

Upvotes: 25

Mine

Reputation: 4241

If you want a rough size, I think bucket_count() and max_load_factor() are enough; they give the current bucket count and the maximum load factor.

Rationale:

  • If max_load_factor() <= 1, then bucket_count() >= the number of items in the map, so bucket_count() is an upper bound on the item count.

  • If max_load_factor() > 1, then bucket_count() * max_load_factor() bounds the number of items the map can hold before it rehashes. Note this is the maximum, not the actual size.

So a rough capacity estimate could look like this:

  // rough upper bound on items the map can hold before it must rehash
  size_t n = mymap.bucket_count();
  float  m = mymap.max_load_factor();
  size_t rough_capacity = (m > 1.0f) ? static_cast<size_t>(n * m) : n;

If you want a more accurate figure, you need to walk all the buckets and see how many elements are in each:

  // Walk every bucket; empty buckets still contribute one slot of overhead,
  // occupied buckets contribute one slot per element.
  size_t count = 0;
  for (unsigned i = 0; i < mymap.bucket_count(); ++i) {
    size_t bucket_size = mymap.bucket_size(i);
    count += (bucket_size == 0) ? 1 : bucket_size;
  }

Upvotes: 9
