Reputation: 374
I'm writing some Mongo queries for pieces of data that are about 12 KB each in raw JSON. I expect to grab anywhere from 5000 to 150000 of these objects for our users at the startup of a program I'm writing. Over our 100Mbps LAN, this takes a while--about 55 seconds for 50000 objects, or 6 seconds for 5000 objects. The objects don't change, so I'm fine caching them after they're in memory. But the initial query time is unacceptable. I've verified using Wireshark that the network is actually causing the bottleneck. It takes nearly a minute to get the packets of all 50000 objects, unfortunately. Converting objects, deserializing, indices, etc. are not causing an issue for me.
I suspect it would be faster if Mongo did something like compressing the data first, sending it to me, and letting me decompress it client-side. Is this a realistic suspicion, and if so, does Mongo have any facility to do this? Or, is there any other way to speed up the transmission of large query result sets? I've tried setting the batch_size higher, and it didn't help.
My environment is PyMongo on Python 3.6 on Windows. The client computer and server hardware specs are more than adequate to handle the (de)compression. I'm trying to avoid a solution that would have me write a program to put on the server to locally query and compress the data before sending it over the network to the client.
Upvotes: 1
Views: 1136
Reputation: 10918
Your observations certainly seem to make sense. Let's do the math:
5'000 * 12 KB = 60 MB
150'000 * 12 KB = 1.8 GB
A 100 Mbps network transmits at most 12.5 MB/s (750 MB/min), so that works out to somewhere between 4.8 s (for 5'000 documents) and 2 min 24 s (for 150'000), assuming an otherwise empty wire. That's quite a bit.
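For anyone who wants to plug in their own numbers, the same back-of-the-envelope estimate can be sketched in Python. The function name `transfer_seconds` and its parameters are made up for illustration. It assumes decimal units (1 MB = 1000 KB) and ignores protocol overhead, so it gives a lower bound rather than a prediction.

```python
def transfer_seconds(num_docs, doc_kb=12, link_mbps=100):
    """Best-case seconds to move num_docs documents over an otherwise idle link."""
    total_megabytes = num_docs * doc_kb / 1000  # decimal units: 1 MB = 1000 KB
    megabytes_per_second = link_mbps / 8        # 8 bits per byte
    return total_megabytes / megabytes_per_second


# The two bounds from the question, plus the 50'000-document case it measured.
for n in (5_000, 50_000, 150_000):
    print(f"{n:>7} documents: {transfer_seconds(n):6.1f} s")
```

For 50'000 documents this gives 48 s, which is close to the roughly 55 seconds reported in the question. That is what you would expect if the link really is the bottleneck.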
If upgrading to, say, Gigabit Ethernet is not an option for you, there's still hope:
MongoDB v3.6 comes with wire protocol compression: https://emptysqua.re/blog/driver-features-for-mongodb-3-6/. It was released only a few days ago, so you may need to wait for a version of your driver that supports it.
Also, some routers (e.g. from Cisco) support compression at the network level, which should help but obviously requires the right hardware and know-how: https://www.cisco.com/c/en/us/support/docs/wan/data-compression/14156-compress-overview.html
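Once your driver supports the protocol compression mentioned above, turning it on from PyMongo should look roughly like the sketch below. It assumes a PyMongo release that understands the `compressors` and `zlibCompressionLevel` connection string options (3.6 or later, as far as I know) and a server that has zlib among its enabled compressors (the `net.compression.compressors` setting). If the two sides cannot agree on a compressor, the connection should simply stay uncompressed. The helper names and the host are hypothetical.

```python
from urllib.parse import urlencode


def compressed_uri(host="localhost:27017", compressors="zlib", level=None):
    """Build a connection string that asks for wire protocol compression."""
    options = {"compressors": compressors}
    if level is not None:
        # zlib level from -1 (library default) to 9 (smallest output, slowest).
        options["zlibCompressionLevel"] = level
    return f"mongodb://{host}/?{urlencode(options)}"


def connect(host="localhost:27017", level=None):
    """Open a client with compression requested (needs pymongo installed)."""
    from pymongo import MongoClient  # imported lazily so the helper above works without it

    return MongoClient(compressed_uri(host, level=level))
```

You would then call something like `connect("db.example.local:27017")` (a placeholder host) and run your queries as before. Nothing else in the application code needs to change, since compression happens below the query API. How much it helps depends on how compressible your documents are, so measure before and after.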
Upvotes: 2