danarmak

Reputation: 1190

Mongos takes hours to 'show collections'

I have a MongoDB 2.4.8 cluster. My software dynamically partitions data, and I now have about 30,000 sharded collections. The cluster currently contains only one shard (which is a replica set); it is set up as a sharded cluster to allow easy future expansion.

When I start a new mongos process and run show collections, it takes several hours to complete. During this time the mongos is unresponsive to all clients (but the rest of the cluster is fine). If I never run show collections, all other operations through the mongos work normally.

Eventually show collections completes and after that the mongos works fine, and running show collections again on the same mongos returns right away. I only found out there was a problem when I needed to restart a mongos for the first time in many months, during which the collection count rose greatly.

Logically, it would seem that data transfer (about collection chunks) from the config servers to the new mongos is the bottleneck. But neither side shows high CPU or network activity while this is going on.

Is this known behavior? How can I further investigate the problem?

Upvotes: 3

Views: 124

Answers (1)

danarmak

Reputation: 1190

I traced the problem to a faulty config server. After replacing it, everything is working fine again.

Details: the bad server didn't respond to queries, so each query was re-sent to the other config servers after it failed. This added a delay to every request to the config servers. The effect was most pronounced in the 'show collections' operation, which does at least one roundtrip per collection between the mongos and the config servers, and does them all serially.
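
To show why a small per-request delay turns into hours when the roundtrips are serial, here is a rough back-of-the-envelope sketch in Python. The 30,000 collections and the "at least one roundtrip per collection" come from the thread; the 1 ms healthy roundtrip and the 0.5 s failover delay are assumed figures for illustration only, not values measured on this cluster, and the function name is hypothetical.

```python
def serial_listing_seconds(num_collections,
                           roundtrips_per_collection=1,
                           healthy_rtt_s=0.001,
                           failover_delay_s=0.0):
    """Estimate the wall-clock time of a listing that issues its
    config-server requests one after another (no parallelism)."""
    # Each request pays the normal roundtrip plus any time spent waiting
    # on an unresponsive server before the request is re-sent elsewhere.
    per_request_s = healthy_rtt_s + failover_delay_s
    return num_collections * roundtrips_per_collection * per_request_s


# Assumed numbers: 1 ms healthy roundtrip, 0.5 s wasted per request
# on the faulty config server.
healthy_s = serial_listing_seconds(30_000)
degraded_s = serial_listing_seconds(30_000, failover_delay_s=0.5)

print(f"healthy:  about {healthy_s:.0f} seconds")
print(f"degraded: about {degraded_s / 3600:.1f} hours")
```

Under these assumptions the healthy case finishes in well under a minute, while the degraded case runs for several hours. Because the mongos is mostly waiting rather than computing or transferring data, this would also be consistent with neither side showing high CPU or network activity.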

Upvotes: 1
