Reputation: 627
I have about 130 million articles in my Postgres database on AWS. I am trying to index them with Elasticsearch. In a screen session, I entered:
python manage.py search_index --rebuild -f --parallel --model [APP NAME].[MODEL NAME]
Everything began correctly. The output was:
Deleting index '[MODEL NAME]'
Creating index '[MODEL NAME]'
Indexing 129413202 'MODEL NAME' objects (parallel)
But after about 15 hours, the output was just "Killed". I was running this on a t2.xlarge EC2 instance, which has 16 GB of memory. Interestingly, the "Killed" message appeared after I noticed that the connection to the AWS server had dropped, but that shouldn't matter if the process was running in a screen session. Any idea what the issue is? Do I just need an even larger EC2 instance?
Upvotes: 0
Views: 2258
Reputation: 1896
A process unexpectedly exiting with the message "Killed" often means it received a SIGKILL; if so, the exit code would be 137 (128 plus the signal number, 9). It's hard to be certain here, since a process can obviously print "Killed" and exit with code 137 on its own, but assuming you're not doing that in your code, this is what I'd check next.

An unexpected SIGKILL often comes from the kernel's OOM killer, which takes action when the system runs out of memory and typically kills the process with the largest memory footprint. If so, it will have logged details in the kernel logs, which you can read with dmesg.
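For example, something along these lines will usually surface OOM-killer activity on a Linux box (the exact log wording varies by kernel version), and a re-run would let you confirm the exit status:

# Look for OOM-killer entries in the kernel log (timestamps human-readable)
dmesg -T | grep -i -E 'out of memory|oom-killer|killed process'

# On a re-run, an exit status of 137 (128 + 9) confirms SIGKILL
python manage.py search_index --rebuild -f --parallel --model [APP NAME].[MODEL NAME]
echo "exit status: $?"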
If it was the OOM killer, then this sounds like a bug in the indexing code. Indexing a large body of documents into Elasticsearch should require fairly limited working memory, nowhere near 16 GB, but it's easy to accidentally keep too much data in memory for too long, which leads to exactly this kind of excessive memory usage.
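To illustrate the kind of bug I mean, here is a generic sketch, not django-elasticsearch-dsl's actual code; Article and index_chunk are hypothetical stand-ins for your model and for a helper that bulk-sends one batch of documents to Elasticsearch:

from itertools import islice

from myapp.models import Article  # hypothetical app/model

def index_all_leaky(index_chunk):
    # Pitfall: list() materializes all ~130M rows in memory at once,
    # which is exactly the kind of footprint the OOM killer targets.
    index_chunk(list(Article.objects.all()))

def index_all_streaming(index_chunk, chunk_size=1000):
    # .iterator(chunk_size=...) streams rows from Postgres via a
    # server-side cursor, so only one chunk is held in memory at a time.
    rows = Article.objects.all().iterator(chunk_size=chunk_size)
    while True:
        chunk = list(islice(rows, chunk_size))
        if not chunk:
            break
        index_chunk(chunk)

The second version's memory usage stays roughly constant regardless of table size, which is what you want for 130 million rows.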
The command python manage.py search_index suggests you're using the Django Elasticsearch DSL, which fixed a performance issue relatively recently. Make sure you're using a version that contains this fix.
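To check what version you have installed and pull in the latest release (assuming pip here; adjust for your package manager):

pip show django-elasticsearch-dsl
pip install --upgrade django-elasticsearch-dsl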
Upvotes: 1