Wirtsi

Reputation: 23

Massive inserts kill arangod (well, almost)

I was wondering if anyone has ever encountered this:

When inserting documents via AQL, I can easily kill my ArangoDB server. For example:

FOR i IN 1 .. 10
  FOR u IN users
    INSERT {
        _from: u._id,
        _to: CONCAT("posts/",CEIL(RAND()*2000)),
        displayDate: CEIL(RAND()*100000000)
    } INTO canSee

(where users contains 500,000 entries), memory usage keeps climbing until arangod is effectively killed.

Ok, so I'm creating a lot of entries, and AQL might be implemented in a way that does this in bulk. When doing the writes via the db.save method it works, but it is much slower.
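
For reference, here is a minimal arangosh sketch of the document-by-document db.save approach I mean; the collection names match the query above, but the loop itself is just an illustration (it uses the three-argument save(from, to, data) form for edge collections):

var cursor = db.users.all();
while (cursor.hasNext()) {
  var u = cursor.next();
  for (var i = 0; i < 10; i++) {
    db.canSee.save(
      u._id,                                        // _from
      "posts/" + Math.ceil(Math.random() * 2000),   // _to
      { displayDate: Math.ceil(Math.random() * 100000000) }
    );
  }
}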

I also suspect this might have to do with the write-ahead log filling up.

But still, is there a way I can fix this? Writing a lot of entries to a database should not necessarily kill it.

Logs say

DEBUG [./lib/GeneralServer/GeneralServerDispatcher.h:411] shutdownHandler called, but no handler is known for task

DEBUG [arangod/VocBase/datafile.cpp:949] created datafile '/usr/local/var/lib/arangodb/journals/logfile-6623368699310.db' of size 33554432 and page-size 4096

DEBUG [arangod/Wal/CollectorThread.cpp:1305] closing full journal '/usr/local/var/lib/arangodb/databases/database-120933/collection-4262707447412/journal-6558669721243.db'

Best

Upvotes: 2

Views: 465

Answers (1)

stj

Reputation: 9097

The above query will insert 5M documents (10 iterations over 500,000 users) into ArangoDB in a single transaction. This will take a while to complete, and while the transaction is ongoing, it will hold lots of rollback data (which may be needed if the transaction fails) in memory.

Additionally, the above query will first build up all the documents to insert in memory, and only once that is done will it start inserting them. Building up all the documents also consumes a lot of memory. When executing this query, you will see memory usage steadily increasing until the actual inserts start and the disk writes kick in.

There are at least two ways to improve this:

  • it might be beneficial to split the query into multiple, smaller transactions. Each transaction then won't be as big as the original one and won't tie up as many system resources while it is running (a rough sketch follows after this list).

  • for the query above, it technically isn't necessary to build up all the documents in memory before inserting them. Instead, documents read from users could be inserted into canSee as they arrive. This won't speed up the query, but it will significantly lower memory consumption during query execution for result sets as big as the one above. It will also lead to the writes starting immediately, and thus to write-ahead log collection starting earlier. Not all queries are eligible for this optimization, but some (including the one above) are. I worked on a mechanism that detects eligible queries and executes them this way. The change was pushed into the devel branch today, and will be available with ArangoDB 2.5.
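
As a rough illustration of the first suggestion, the original query could be run as several smaller AQL queries from arangosh, each covering a slice of the users collection; the batch size and bind parameter names below are illustrative, not part of the original answer:

var batchSize = 50000;
var total = db.users.count();
for (var offset = 0; offset < total; offset += batchSize) {
  // each db._query call is its own transaction, covering only one
  // slice of users instead of all 500,000 at once
  db._query(
    "FOR u IN users " +
    "  LIMIT @offset, @count " +
    "  FOR i IN 1 .. 10 " +
    "    INSERT { " +
    "      _from: u._id, " +
    "      _to: CONCAT('posts/', CEIL(RAND() * 2000)), " +
    "      displayDate: CEIL(RAND() * 100000000) " +
    "    } INTO canSee",
    { offset: offset, count: batchSize }
  );
}

Each iteration commits independently, so the rollback data held in memory at any point covers only one batch; the trade-off is that a failure part-way through leaves earlier batches already committed, unlike the single big transaction.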

Upvotes: 5
