user1802604

Reputation: 396

Poor performance when traversing all documents in MongoDB using Java

I store system logs in MongoDB, with an index on the 'time' field. There are currently about 190k log documents. When I fetch all of them in Java with DBCollection.find(), traversing every document in the collection takes almost 10 seconds. Is there something I have missed that causes this poor performance?

Here is the code I used:

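// Mongo, DB, DBCollection, DBCursor, DBObject and BasicDBObject come from the com.mongodb package.
Mongo mongo = new Mongo();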
DB db = mongo.getDB("Log");
DBCollection coll = db.getCollection("SystemLog");
int count = 0;

long findStart = Calendar.getInstance().getTimeInMillis();

// Sort by time, newest first (-1 means descending).
BasicDBObject queryObj = new BasicDBObject();
queryObj.put("time", -1);

DBCursor cursor = coll.find().sort(queryObj);
while(cursor.hasNext()) {
    DBObject obj = cursor.next();
    // Do something
    ++count;
}
long findEnd = Calendar.getInstance().getTimeInMillis();
System.out.println("Time for traversing all system logs (" + count + "):\t" + (findEnd-findStart) + "ms.");

And the printed result is:

Time for traversing all system log (194309):    10496ms.

I have run it several times, and the timing is about the same whether I run it once or repeatedly. I have also tried removing sort() and simply fetching all the logs from MongoDB; traversing all the documents then takes about 6 seconds. That is still too slow for my requirements. Are there any implementation tips that would speed up the traversal?

Many thanks.

Upvotes: 3

Views: 900

Answers (1)

VladT

Reputation: 360

Do you really need to traverse all the documents? In the code above, it looks like you are just loading each object into memory, one by one.

  1. The index on the 'time' field should be created as descending, since that is how you are sorting.
  2. If the index is compound (it covers more fields than just 'time'), make sure you also add an index on 'time' alone. Also, when you add a filter to that query, make sure the 'time' field comes last in the index and is descending.
  3. The performance is not that bad, considering you are reading 190k objects one by one.
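
To put the last point in numbers, here is a small sketch that turns the figures from the question into a per-document cost. The class and method names (TraversalThroughput, microsPerDocument, documentsPerSecond) are made up for illustration; only the 194309 documents, the 10496 ms and the "about 6 seconds" come from the question.

```java
import java.util.Locale;

public class TraversalThroughput {

    // Average time spent per document, in microseconds.
    static double microsPerDocument(long elapsedMillis, long documentCount) {
        return elapsedMillis * 1000.0 / documentCount;
    }

    // Number of documents traversed per second.
    static double documentsPerSecond(long elapsedMillis, long documentCount) {
        return documentCount * 1000.0 / elapsedMillis;
    }

    public static void main(String[] args) {
        long count = 194309;         // document count printed in the question
        long sortedMillis = 10496;   // traversal time with sort(), from the question
        long unsortedMillis = 6000;  // "about 6 seconds" without sort(), approximate

        System.out.printf(Locale.ROOT, "with sort:    %.1f us/doc, %.0f docs/s%n",
                microsPerDocument(sortedMillis, count),
                documentsPerSecond(sortedMillis, count));
        System.out.printf(Locale.ROOT, "without sort: %.1f us/doc, %.0f docs/s%n",
                microsPerDocument(unsortedMillis, count),
                documentsPerSecond(unsortedMillis, count));
    }
}
```

With the sorted run this works out to roughly 54 microseconds per document, or about 18,500 documents per second, so the total is dominated by the sheer number of documents rather than by any single slow step.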

(Please note that my experience with MongoDB does not involve working with the Java driver.)

Upvotes: 1
