Reputation: 326
Context:
I have an AWS EC2 instance
It runs Solr 5.1.0 with
-Xms2048m -Xmx2048m
Extra: (updated)
JdbcDataSource
Situation:
The index on Solr holds 200,000 documents and is queried no more than once per second. However, within about 10 days, the memory and disk usage of the server reach 90% - 95% of the available space.
When I investigate the disk usage with sudo du -sh / it only reports a total of 2.3G, not nearly as much as what df -k tells me (Use% -> 92%).
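One way to check whether the gap comes from files that were deleted while still held open by a process (assuming lsof is installed on the instance):
sudo lsof +L1
Space held by such files is counted by df but not by du, and it is only released once the owning process closes them.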
I can, sort of, resolve the situation by restarting the Solr service.
What am I missing? Why does Solr consume all the memory and disk space, and how can I prevent it?
Extra info for @TMBT
Sorry for the delay, but I've been monitoring the Solr production server for the last few days. You can see a roundup here: https://www.dropbox.com/s/x5diyanwszrpbav/screencapture-app-datadoghq-com-dash-162482-1468997479755.jpg?dl=0 The current state of Solr: https://www.dropbox.com/s/q16dc5t5ctl32od/Screenshot%202016-07-21%2010.29.13.png?dl=0 I restarted Solr at the beginning of the monitoring, and now, 2 days later, I see the disk space going down at a rate of 1.5 GB per day. If you need more specifics, let me know.
ls -lh /var/solr/logs -> total 72M
With the monitoring in place I tested the most common queries. They do involve faceting (field and query facets), sorting, grouping, … but they don't really move the heap or GC count metrics.
Upvotes: 1
Views: 8614
Reputation: 326
I finally managed to solve this problem. So I'm answering my own question.
I changed / added the following lines in the log4j.properties file, which is located in /var/solr/ (the Solr root location in my case).
# log4j.rootLogger=INFO, file, CONSOLE
# adding:
log4j.rootLogger=WARN, file, CONSOLE
This lowers the root logging level to WARN.
# adding:
log4j.appender.file.Threshold=INFO
This sets the logging threshold for the file appender to INFO.
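For reference, the stock Solr 5 log4j.properties uses a RollingFileAppender, so the same file can also cap how much the rolled logs are allowed to grow (the values below are just examples):
# cap size and number of rolled log files (example values)
log4j.appender.file.MaxFileSize=4MB
log4j.appender.file.MaxBackupIndex=9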
You can see in the graphs below that, as of September 2nd, the disk usage is steady, as it should be. The same is true for the memory consumption on the server.
Upvotes: 5
Reputation: 1183
First, visit your.solr.instance:[port]/[coreName]/admin/system and check how many resources Solr is actually using. The memory and system elements will be most useful to you. It may be that something else on the box is the culprit for at least some of the resource usage.
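If you'd rather check from the command line, the same handler can be queried directly (same placeholder host, port and core as above):
curl "http://your.solr.instance:[port]/solr/[coreName]/admin/system?wt=json"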
To me, that you can "sort of" resolve the problem by restarting Solr screams "query and import obnoxiousness" for memory. For disk space, I wouldn't be surprised if log files are behind it. I also wonder if your numerous delta imports are leaving a lot of old, deleted documents lying around until Solr automatically cleans them up. In fact, if you go to http://your.solr.instance:[port]/solr/#/[coreName], you should be able to see how many deleted docs are in your index. If there's a very, very large number, you should schedule a time during low usage to run optimize to get rid of them.
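If you do go that route, a minimal sketch of a one-off optimize call, reusing the placeholder host, port and core name from above (run it during low usage):
curl "http://your.solr.instance:[port]/solr/[coreName]/update?optimize=true"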
Also be aware that Solr seems to have a tendency to fill up as much of the given heap space as it can.
Since the logs are generated on the server, check how many of them exist. Solr after 4.10 has a nasty habit of generating large numbers of log files, which can cause disk space issues, especially with how often you import. For information on how to deal with Solr's love of logging, I'm going to refer to my self-answer at Solr 5.1: Solr is creating way too many log files. Basically, you'll want to edit the solr startup script to disable Solr's log backups and then replace them with a solution of your own.
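As a rough sketch of a replacement cleanup, assuming the log directory from the question and that rotated files follow the default solr.log.N naming (verify what is actually in the directory before deleting anything):
# remove rotated Solr logs older than 7 days; the live solr.log stays untouched
find /var/solr/logs -type f -name "solr.log.*" -mtime +7 -delete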
If you have a master-slave setup, check to see if the slave is backing up certain configuration files, like schema.xml or solrconfig.xml.
Depending on how many records are imported per delta, you could have commits overlapping each other, which will affect resource usage on your box. If you read anything in the logs about overlapping ondecksearchers, this is definitely an issue for you.
Lots of delta imports also means lots of commits. Commit is a fairly heavy operation. You'll want to tweak solrconfig.xml
to soft commit after a number of documents and a hard commit after a little bit more. If you perform the commits in batches, your frequent deltas should have less of an impact.
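A minimal sketch of what that could look like inside the updateHandler section of solrconfig.xml; the document counts are examples only and should be tuned to your delta volume:
<autoSoftCommit>
  <maxDocs>500</maxDocs>              <!-- make new docs searchable after ~500 additions -->
</autoSoftCommit>
<autoCommit>
  <maxDocs>5000</maxDocs>             <!-- flush to disk less often than the soft commits -->
  <openSearcher>false</openSearcher>  <!-- don't open a new searcher on every hard commit -->
</autoCommit>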
If you are joining columns for your imports, you may need to index those joined columns in your database. If your database is not on the same machine as Solr, network latency is a possible problem. It's one I've struggled with in the past. If the DB is on the same machine and you need to index, then not indexing will most certainly have a negative effect on your box's resources.
It may be helpful to you to use something like VisualVM on Solr to view heap usage and GC. You want to make sure there's not a rapid increase in usage and you also want to make sure that the GC isn't having a bunch of stop-the-world collections that can cause weirdness on your box.
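If Solr runs on a remote EC2 box, one way to let VisualVM attach is to expose JMX through the JVM options, for example via the SOLR_OPTS include that the startup script reads; the port is arbitrary and leaving authentication off is only acceptable on a locked-down network:
# hypothetical additions so VisualVM can attach over JMX
SOLR_OPTS="$SOLR_OPTS -Dcom.sun.management.jmxremote \
 -Dcom.sun.management.jmxremote.port=18983 \
 -Dcom.sun.management.jmxremote.authenticate=false \
 -Dcom.sun.management.jmxremote.ssl=false"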
Optimize is a very intensive operation that you shouldn't need to use often, if at all, after 4.10. Some people still do, though, and if you have tons of deleted documents it might be useful to you. If you ever decide to employ an optimization strategy, it should be done only during times of low usage, as optimize temporarily doubles the size of your index. Optimize merges segments and purges the documents that your deltas have marked as deleted.
By "large fields", I mean fields with large amounts of data in them. You would need to look up the size limits for each field type you're using, but if you're running towards the max size for a certain field, you may want to try to find a way to reduce the size of your data. Or you can omit importing those large columns into Solr and instead retrieve the data from the columns in the source DB after getting a particular document(s) from Solr. It depends on your set up and what you need. You may or may not be able to do much about it. If you get everything else running more efficiently you should be fine.
The type of queries you run can also cause you problems. Lots of sorting, faceting, etc. can be very memory-intensive. If I were you, I would hook VisualVM up to Solr so I could watch heap usage and GC, and then load test Solr using typical queries.
Upvotes: 3