Reputation: 90
I'm trying to index a large file repository (10,000+ documents of every format under the sun) using Solr 6.1.0's SimplePostTool (bin/post). It runs for quite a while with no problems, then crashes with the following:
POSTing file ThingsGoingWellUpToHere.pdf (application/pdf) to [base]/extract
POSTing file EXAMPLE1.pdf (application/pdf) to [base]/extract
SimplePostTool: WARNING: IOException while reading response: java.net.SocketException: Unexpected end of file from server
POSTing file EXAMPLE2.pptx (application/vnd.openxmlformats-officedocument.presentationml.presentation) to [base]/extract
SimplePostTool: FATAL: Connection error (is Solr running at http://localhost:8983/solr/sample/update ?): java.net.ConnectException: Connection refused
At this point Solr goes down too:
$ solr status
Found 1 Solr nodes:
Solr process 26499 from /opt/solr-6.1.0/bin/solr-8983.pid not found.
I wind up having to run solr restart whenever this happens. Has anyone else run into a similar issue?
Quick note: if I had to take a wild guess, it's something to do with corrupt files. The collection I'm working with is ~25 GB and has gone through two layers of SCP on spotty connections. If this turns out to be the case, I'll close this out myself.
EDIT: I tried posting the individual documents that SimplePostTool had failed on, and they went through fine, so it's unlikely to be a corruption issue. The search continues...
Upvotes: 0
Views: 1147
Reputation: 90
It was totally a memory issue. If you see this error, assume you didn't allocate enough memory to your Solr instance. Just bump up the JVM's maximum heap size (-Xmx) when running solr start; the start script's -m option sets it for you.
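For anyone landing here later, a minimal sketch of what that looks like in practice. The 2g value is only an illustrative assumption (it is not from my setup notes), so size it to your machine and your documents; as far as I know the stock default heap is just 512 MB.

```shell
# Start Solr with a 2 GB heap; -m sets both -Xms and -Xmx for the Solr JVM.
# 2g is an illustrative value, not a recommendation.
bin/solr start -m 2g

# To make the setting persistent instead, set this in solr.in.sh:
# SOLR_JAVA_MEM="-Xms2g -Xmx2g"
```

After that, re-run bin/post against the same directory and check whether the crash still occurs; if it does, try a larger heap.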
Upvotes: 1