Reputation: 326
Context:
I have an AWS EC2 instance
It runs Solr 5.1.0 with
-Xms2048m -Xmx2048m
Extra: (updated)
JdbcDataSource
Situation:
The index on Solr holds 200,000 documents and is queried no more than once per second. However, within about 10 days, the memory and disk usage of the server reach 90% - 95% of the available space.
When I investigate the disk usage with sudo du -sh / it only reports a total of 2.3G, not nearly as much as what df -k tells me (Use% -> 92%).
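One way to check whether the gap comes from files that were deleted while still held open by a process (assuming lsof is installed on the instance):
sudo lsof +L1
Space held by such files is counted by df but not by du, and it is only released once the owning process closes them.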
I can, sort of, resolve the situation by restarting the Solr service.
What am I missing? Why does Solr consume all the memory and disk space, and how can I prevent it?
Extra info for @TMBT
Sorry for the delay, but I've been monitoring the Solr production server for the last few days. You can see a roundup here: https://www.dropbox.com/s/x5diyanwszrpbav/screencapture-app-datadoghq-com-dash-162482-1468997479755.jpg?dl=0 The current state of Solr: https://www.dropbox.com/s/q16dc5t5ctl32od/Screenshot%202016-07-21%2010.29.13.png?dl=0 I restarted Solr at the beginning of the monitoring, and now, 2 days later, I see the disk space going down at a rate of 1.5 GB per day. If you need more specifics, let me know.
ls -lh /var/solr/logs -> total 72M
With the monitoring in place I tested the most common queries. They do involve faceting (field and query facets), sorting, grouping, … but they don't really move the heap or GC count metrics.
Upvotes: 1
Views: 8614
Reputation: 326
I finally managed to solve this problem. So I'm answering my own question.
I changed / added the following lines in the log4j.properties file, which is located in /var/solr/ (the Solr root location in my case).
# log4j.rootLogger=INFO, file, CONSOLE
# adding:
log4j.rootLogger=WARN, file, CONSOLE
This lowers the root logging level to WARN.
# adding:
log4j.appender.file.Threshold=INFO
This sets the logging threshold for the file appender to INFO.
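For reference, the stock Solr 5 log4j.properties uses a RollingFileAppender, so the same file can also cap how much the rolled logs are allowed to grow (the values below are just examples):
# cap size and number of rolled log files (example values)
log4j.appender.file.MaxFileSize=4MB
log4j.appender.file.MaxBackupIndex=9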
You can see in the graphs below that, as of September 2nd, the disk usage is steady, as it should be. The same is true for the memory consumption on the server.
Upvotes: 5
Reputation: 1183
First, visit your.solr.instance:[port]/[coreName]/admin/system and check how many resources Solr is actually using. The memory and system elements will be most useful to you. It may be that something else on the box is the culprit for at least some of the resource usage.
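If you'd rather check from the command line, the same handler can be queried directly (same placeholder host, port and core as above):
curl "http://your.solr.instance:[port]/solr/[coreName]/admin/system?wt=json"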
To me, that you can "sort of" resolve the problem by restarting Solr screams "query and import obnoxiousness" for memory. For disk space, I wouldn't be surprised if log files are behind it. I also wonder if your numerous delta imports are leaving a lot of old, deleted documents lying around until Solr automatically cleans them up. In fact, if you go to http://your.solr.instance:[port]/solr/#/[coreName], you should be able to see how many deleted docs are in your index. If there's a very, very large number, you should schedule a time during low usage to run optimize to get rid of them.
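If you do go that route, a minimal sketch of a one-off optimize call, reusing the placeholder host, port and core name from above (run it during low usage):
curl "http://your.solr.instance:[port]/solr/[coreName]/update?optimize=true"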
Also be aware that Solr seems to have a tendency to fill up as much of the given heap space as it can.
Since the logs are generated on the server, check how many of them exist. Solr after 4.10 has a nasty habit of generating large numbers of log files, which can cause disk space issues, especially with how often you import. For information on how to deal with Solr's love of logging, I'm going to refer to my self-answer at Solr 5.1: Solr is creating way too many log files. Basically, you'll want to edit the solr startup script to disable Solr's log backups and then replace them with a solution of your own.
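As a rough sketch of a replacement cleanup, assuming the log directory from the question and that rotated files follow the default solr.log.N naming (verify what is actually in the directory before deleting anything):
# remove rotated Solr logs older than 7 days; the live solr.log stays untouched
find /var/solr/logs -type f -name "solr.log.*" -mtime +7 -delete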
If you have a master-slave setup, check to see if the slave is backing up certain configuration files, like schema.xml or solrconfig.xml.
Depending on how many records are imported per delta, you could have commits overlapping each other, which will affect resource usage on your box. If you read anything in the logs about overlapping ondecksearchers, this is definitely an issue for you.
Lots of delta imports also means lots of commits. Commit is a fairly heavy operation. You'll want to tweak solrconfig.xml
to soft commit after a number of documents and a hard commit after a little bit more. If you perform the commits in batches, your frequent deltas should have less of an impact.
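A minimal sketch of what that could look like inside the updateHandler section of solrconfig.xml; the document counts are examples only and should be tuned to your delta volume:
<autoSoftCommit>
  <maxDocs>500</maxDocs>              <!-- make new docs searchable after ~500 additions -->
</autoSoftCommit>
<autoCommit>
  <maxDocs>5000</maxDocs>             <!-- flush to disk less often than the soft commits -->
  <openSearcher>false</openSearcher>  <!-- don't open a new searcher on every hard commit -->
</autoCommit>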
If you are joining columns for your imports, you may need to index those joined columns in your database. If your database is not on the same machine as Solr, network latency is a possible problem. It's one I've struggled with in the past. If the DB is on the same machine and you need to index, then not indexing will most certainly have a negative effect on your box's resources.
It may be helpful to you to use something like VisualVM on Solr to view heap usage and GC. You want to make sure there's not a rapid increase in usage and you also want to make sure that the GC isn't having a bunch of stop-the-world collections that can cause weirdness on your box.
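If Solr runs on a remote EC2 box, one way to let VisualVM attach is to expose JMX through the JVM options, for example via the SOLR_OPTS include that the startup script reads; the port is arbitrary and leaving authentication off is only acceptable on a locked-down network:
# hypothetical additions so VisualVM can attach over JMX
SOLR_OPTS="$SOLR_OPTS -Dcom.sun.management.jmxremote \
 -Dcom.sun.management.jmxremote.port=18983 \
 -Dcom.sun.management.jmxremote.authenticate=false \
 -Dcom.sun.management.jmxremote.ssl=false"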
Optimize is a very intensive operation that you shouldn't need to use often, if at all, after 4.10. Some people still do, though, and if you have tons of deleted documents it might be useful to you. If you ever decide to employ an optimization strategy, it should be done only during times of low usage, as optimize temporarily doubles the size of your index. Optimize merges segments and purges the documents that your deltas have marked as deleted.
By "large fields", I mean fields with large amounts of data in them. You would need to look up the size limits for each field type you're using, but if you're running towards the max size for a certain field, you may want to try to find a way to reduce the size of your data. Or you can omit importing those large columns into Solr and instead retrieve the data from the columns in the source DB after getting a particular document(s) from Solr. It depends on your set up and what you need. You may or may not be able to do much about it. If you get everything else running more efficiently you should be fine.
The type of queries you run can also cause you problems. Lots of sorting, faceting, etc. can be very memory-intensive. If I were you, I would hook VisualVM up to Solr so I could watch heap usage and GC, and then load test Solr using typical queries.
Upvotes: 3