THM

Reputation: 671

How to configure Lucene (SOLR) internal caching - memory issue/leak?

I am using SOLR 4.4.0 and I found a (possible) issue related to the internal caching mechanism. The JVM runs with -Xmx15g, but 12 GB was never freed. I created a heap dump and analyzed it with Memory Analyzer: I found 2 x 6 GB used as cache data. I then repeated the test with -Xmx12g and found 1 x 3.5 GB. It was always the same cache. I checked the source code and found:

  /** Expert: The cache used internally by sorting and range query classes. */
  public static FieldCache DEFAULT = new FieldCacheImpl();

see http://grepcode.com/file/repo1.maven.org/maven2/org.apache.lucene/lucene-core/4.4.0/org/apache/lucene/search/FieldCache.java#FieldCache.0DEFAULT

This is very bad news, because it is a public static field and it is used in about 160 places in the source code.
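To illustrate how this cache gets filled (my own reading of the Lucene 4.4 API, not taken from the Solr code; the field name "price" is made up): any sorted search silently populates FieldCache.DEFAULT, roughly like this:

    import java.io.IOException;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.Sort;
    import org.apache.lucene.search.SortField;
    import org.apache.lucene.search.TopDocs;

    class SortedSearchExample {
        // Any sorted search loads one value array per sort field per segment
        // into FieldCache.DEFAULT, and there is no public eviction policy.
        static TopDocs sortedSearch(IndexSearcher searcher, Query query) throws IOException {
            Sort sort = new Sort(new SortField("price", SortField.Type.LONG));
            // internally this ends up calling something equivalent to
            // FieldCache.DEFAULT.getLongs(atomicReader, "price", false)
            return searcher.search(query, 10, sort);
        }
    }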

Memory Analyzer says:

One instance of "org.apache.lucene.search.FieldCacheImpl" loaded by "org.apache.catalina.loader.WebappClassLoader @ 0x58c3a9848" occupies 4,103,248,240 (80.37%) bytes. The memory is accumulated in one instance of "java.util.HashMap$Entry[]" loaded by "".

Keywords java.util.HashMap$Entry[] org.apache.catalina.loader.WebappClassLoader @ 0x58c3a9848 org.apache.lucene.search.FieldCacheImpl

I do not know how to manage this kind of cache - any advice?

And finally I got an OutOfMemoryError, with 12 GB of memory blocked.

Upvotes: 2

Views: 1517

Answers (2)

THM

Reputation: 671

I implemented a kind of workaround. I created the following class:

import org.apache.lucene.search.FieldCache;

public class InternalApplicationCacheManager implements InternalApplicationCacheManagerMBean {

    // Number of entries currently held in Lucene's global field cache.
    public synchronized int getInternalCacheSize() {
        return FieldCache.DEFAULT.getCacheEntries().length;
    }

    // Drops all entries from the global field cache.
    public synchronized void purgeInternalCaches() {
        FieldCache.DEFAULT.purgeAllCaches();
    }
}
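The MBean interface is not shown above; following the standard JMX naming pattern, a minimal version matching the two methods would look like this:

    public interface InternalApplicationCacheManagerMBean {
        int getInternalCacheSize();
        void purgeInternalCaches();
    }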

Then I registered it with JMX from inside org.apache.lucene.search.FieldCacheImpl:

    ...
    private synchronized void init() {
        ...
        initBeans();
    }

    private void initBeans() {
        try {
            InternalApplicationCacheManager cacheManagerMBean = new InternalApplicationCacheManager();
            MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("org.apache.lucene.search.jmx:type=InternalApplicationCacheManager");
            mbs.registerMBean(cacheManagerMBean, name);
        } catch (InstanceAlreadyExistsException e) {
            ...
        }
    }
    ...

This solution lets you invalidate the internal caches, which partially solves the issue. Unfortunately, there are other places (mostly caches) where data is stored and not removed as quickly as I expected.
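Once registered, the bean can be driven from JConsole or programmatically from the same JVM. A hypothetical usage sketch (object name as registered above; note that getInternalCacheSize() is exposed as the read-only attribute "InternalCacheSize"):

    import java.lang.management.ManagementFactory;
    import javax.management.MBeanServer;
    import javax.management.ObjectName;

    public class CachePurgeClient {
        public static void main(String[] args) throws Exception {
            MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("org.apache.lucene.search.jmx:type=InternalApplicationCacheManager");
            // read the current number of FieldCache entries
            Integer entries = (Integer) mbs.getAttribute(name, "InternalCacheSize");
            System.out.println("FieldCache entries before purge: " + entries);
            // drop everything FieldCache.DEFAULT currently holds
            mbs.invoke(name, "purgeInternalCaches", null, null);
        }
    }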

Upvotes: 1

fatih

Reputation: 1395

If you use FieldCacheRangeFilter, you may want to try range filters that work without the field cache (see the sketch below). If sorting is the issue, you may try using fewer sort fields, or fields with a data type that uses less memory.
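For example (my sketch; the field name "price" is made up and assumed to be indexed as a numeric LongField), NumericRangeFilter answers the same range question from the indexed trie terms instead of from FieldCache.DEFAULT:

    import org.apache.lucene.search.FieldCacheRangeFilter;
    import org.apache.lucene.search.Filter;
    import org.apache.lucene.search.NumericRangeFilter;

    class RangeFilterExamples {
        // FieldCache-backed variant: fast once warm, but loads the whole
        // "price" field into FieldCache.DEFAULT and keeps it there.
        static Filter cacheBacked() {
            return FieldCacheRangeFilter.newLongRange("price", 10L, 100L, true, true);
        }

        // Cache-free alternative: resolves the range against the indexed
        // trie terms, so no FieldCache entry is created.
        static Filter trieBased() {
            return NumericRangeFilter.newLongRange("price", 10L, 100L, true, true);
        }
    }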

The field cache for each reader/atomic reader is thrown away when the reader is garbage collected, so re-initializing the reader should clear the cache. This also means that the first operation using the cache will be a lot slower.
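A minimal sketch of such a re-initialization, assuming a plain DirectoryReader (segment readers that survive the reopen keep their cache entries, which is usually what you want):

    import java.io.IOException;
    import org.apache.lucene.index.DirectoryReader;

    public final class ReaderRefresher {
        // Swaps in a fresh reader if the index changed. Once the old reader is
        // closed and garbage collected, the FieldCache entries keyed to its
        // dropped segments become eligible for eviction as well.
        public static DirectoryReader refresh(DirectoryReader reader) throws IOException {
            DirectoryReader newReader = DirectoryReader.openIfChanged(reader);
            if (newReader == null) {
                return reader; // index unchanged; keep the warm caches
            }
            reader.close();
            return newReader;
        }
    }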

The fact is: FieldCache-based range filters and sorting rely on the cache. There is no getting around it when you really need them; you can only adapt your usage to minimize memory consumption.

Upvotes: 0
