Reputation: 746
I'm writing a migration program in Perl to transform the data in one MongoDB database collection into another database collection. Millions of documents need to be transformed, and performance is very bad (it would take weeks to complete, which is not acceptable). So I decided to use Parallel::TaskManager to create multiple processes and do the transformation in parallel. Performance starts out OK, then rapidly tails off, and then I start getting the following errors:
update error: MongoDB::NetworkTimeout: Timed out while waiting for socket to become ready for reading
at /usr/local/share/perl/5.18.2/Meerkat/Collection.pm line 322.
update error: MongoDB::NetworkTimeout: Timed out while waiting for socket to become ready for reading
at /usr/local/share/perl/5.18.2/Meerkat/Collection.pm line 322.
So my suspicion is that this is caused by the spawned processes not letting go of sockets quickly enough. I'm not sure how to fix this, though, if that is in fact the problem.
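For reference, the worker loop looks roughly like the sketch below (simplified; the calls shown are from Parallel::ForkManager, which has a similar fork-per-task API, and the batch-splitting and transform step are illustrative, not my actual code). Each child opens its own MongoDB connection after the fork, since the driver's sockets cannot be shared across processes:

```perl
use strict;
use warnings;
use Parallel::ForkManager;   # illustrative stand-in with a similar API
use MongoDB;

my $pm = Parallel::ForkManager->new(8);   # 8 worker processes

for my $batch (@batches) {                # @batches: pre-split _id ranges (illustrative)
    $pm->start and next;                  # parent continues; child falls through

    # Each child must open its OWN connection after the fork --
    # MongoDB client sockets are not safe to share across processes.
    my $client = MongoDB->connect('mongodb://localhost');
    my $src    = $client->ns('olddb.docs');

    my $cursor = $src->find({ _id => { '$gte' => $batch->{from},
                                       '$lt'  => $batch->{to} } });
    while (my $doc = $cursor->next) {
        transform_and_write($doc);        # illustrative transform step
    }

    $pm->finish;                          # child exits, closing its sockets
}
$pm->wait_all_children;
```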
What I've tried:
sudo sysctl -w net.ipv4.tcp_keepalive_time=120
and restarted my mongod.
Here are the details of my setup:
Single Mongod, no replication or sharding.
Both databases are on this server. The Perl program iterates over the original database, does some processing on the data in each document, and writes to 3 collections in the new database.
Using MongoDB::Client to access the original database and Meerkat to write to the new database, with write_safety set to zero for both.
I'm not sure how to read this, but here is a segment of mongostat output from the time the errors were occurring:
insert query update delete getmore command % dirty % used flushes vsize res qr|qw ar|aw netIn netOut conn time
*0 *0 *0 *0 0 1|0 0.0 0.3 0 20.4G 9.4G 0|0 1|35 79b 15k 39 11:10:37
*0 3 8 *0 0 11|0 0.0 0.3 0 20.4G 9.4G 0|0 2|35 5k 18k 39 11:10:38
*0 3 1 *0 1 5|0 0.1 0.3 0 20.4G 9.4G 0|0 1|35 2k 15m 39 11:10:39
*0 12 4 *0 1 13|0 0.1 0.3 0 20.4G 9.4G 0|0 2|35 9k 577k 43 11:10:40
*0 3 1 *0 3 5|0 0.1 0.3 0 20.4G 9.4G 0|0 1|34 2k 10m 43 11:10:41
*0 3 8 *0 1 10|0 0.1 0.3 0 20.4G 9.4G 0|0 2|34 5k 2m 43 11:10:42
*0 9 24 *0 0 29|0 0.1 0.3 0 20.4G 9.4G 0|0 5|34 13k 24k 43 11:10:43
*0 3 8 *0 0 10|0 0.1 0.3 0 20.4G 9.4G 0|0 5|35 4k 12m 43 11:10:44
*0 3 8 *0 0 11|0 0.1 0.3 0 20.4G 9.4G 0|0 5|35 5k 12m 42 11:10:45
*0 *0 *0 *0 0 2|0 0.1 0.3 0 20.4G 9.3G 0|0 4|35 211b 12m 42 11:10:46
Please let me know if you would like to see any additional information to help me diagnose this problem.
Dropping the number of parallel processes from 8 (or more) down to 3 seems to cut down the number of timeout errors, but at the cost of throughput.
Upvotes: 1
Views: 482
Reputation: 746
None of the tuning suggestions helped, nor did bulk inserts.
I continued to investigate, and the root of the problem was that my process was doing many "$addToSet" operations, which can become slow when the target arrays grow large. As a result, slow updates were tying up all the available sockets. I restructured my documents so that I no longer use arrays that can grow large, and the insert rate returned to an acceptable level.
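Concretely, the change was to go from one document holding an ever-growing array to one small document per array element (collection and field names below are illustrative, not my actual schema):

```perl
use strict;
use warnings;
use MongoDB;

my $client = MongoDB->connect('mongodb://localhost');

# Before: every update pushes into one growing array. $addToSet must
# scan the whole array for duplicates, so updates slow down as it grows.
$client->ns('newdb.groups')->update_one(
    { _id => $group_id },
    { '$addToSet' => { members => $member } },
);

# After: one small document per member. A unique index gives the same
# de-duplication $addToSet provided, at roughly constant cost per insert.
my $members = $client->ns('newdb.group_members');
$members->indexes->create_one(
    [ group_id => 1, member => 1 ],
    { unique => 1 },
);
$members->insert_one({ group_id => $group_id, member => $member });
```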
Upvotes: 1