Will
Will

Reputation: 75625

creating a cache of byte arrays

In my code, which often runs on servers I do not control the configuration of, I have collection of users and each user has a byte[] array.

Sometimes these byte[] arrays are unique to the user. Often, though, there will be large numbers users with the exact same byte[] array.

I am trying to reduce the RAM consumption of my server.

I've tried turning my byte[] arrays into strings and interning them, but then I often run into PERM-GEN out-of-memory errors. I also see a significant performance degradation with the encoding/decoding when I want to access the byte[] array for a user, and I see much increased worse-case memory usage - presuambly strings are much larger than arrays.

How can I have a Set<SoftReference<byte[]>> lookup when Java arrays are not hashable and SoftReferences do not wrap the hash of the object the point at either. A Map<byte[],SoftReference<byte[]>> is obviously also defeating itself because the key is itself and prevents collection; and Set is internally implemented in terms of Mapanyway.

So how can I intern byte[] arrays?

Upvotes: 14

Views: 7164

Answers (3)

Evgeniy Dorofeev
Evgeniy Dorofeev

Reputation: 136012

I would implement a cache based on Guava weak values map. It guarantees that if theres no more strong references to byte array the entry will be automatically removed.

class Cache {
    private final ConcurrentMap<Key, byte[]> map = new MapMaker().weakValues().makeMap();

    private static class Key {
        byte[] a;
        int hash;

        Key(byte[] a) {
            this.a = a;
            hash = Arrays.hashCode(a);
        }

        @Override
        public int hashCode() {
            return hash;
        }

        @Override
        public boolean equals(Object obj) {
            if (obj instanceof Key) {
                return Arrays.equals(a, ((Key) obj).a);
            }
            return false;
        }
    }

    public byte[] intern(byte[] a) {
        byte[] a1 = map.putIfAbsent(new Key(a), a);
        if (a1 != null) {
            return a1; 
        }
        return a;
    }
}

Upvotes: 1

gma
gma

Reputation: 2563

If you effectively have many identical arrays, use an HashSet<ByteBuffer> as a cache. You can get the ByteBuffer array with method array() and the ByteBuffer class has hashCode and equals methods. Of course it is better if your arrays are immutable.

EDIT2 The comment from @Will is exact, to be able to get back the array, use a WeakHashMap<ByteBuffer,WeakReference<ByteBuffer>> and do something like that :

public byte[] internalize(byte[] bytes) {
 ByteBuffer wrapped = ByteBuffer.wrap(bytes);
 if(cache.containsKey(wrapped)) {
  wrapped = cache.get(wrapped).get();
 }
 else {
  cache.put(wrapped, new WeakReference<ByteBuffer>(wrapped);
 }
 return wrapped.array();
}

Upvotes: 5

Raedwald
Raedwald

Reputation: 48684

I've tried turning my byte[] arrays into strings and interning them, but then I often run into PERM-GEN out-of-memory errors.

I agree that you need something like String.intern(), but the standard implementation is native, so not much joy.

You could have a Map<Integer,Collection<SoftReference<byte[]>>>, using the hash code of the byte arrays as the Map key. Your intern method could then look up the set of existing byte arrays with the same has code as the given byte arrays. With a good hash code that should give a small set of arrays to examine for a match.


Edit: To clarify:

Something like this:

 class ByteArrayCache
 {
      private final Map<Integer,Collection<SoftReference<byte[]>> map = new ...;

      public final byte[] intern(byte[] byteArray)
      {
           final int hash = Arrays.hashCode(byteArray);
           final Collection<SoftReference<byte[]>> arrays = map.get(hash);
           if (arrays != null) {
              // Search through arrays for a match, and return the match.
              // If no match found, add byteArray to the collection and return it
           } else {
              // create a new map entry, add byteArray to it, and return byte array
           }
      }
 }

Upvotes: 2

Related Questions