David Lynch
David Lynch

Reputation: 73

Optimizing Compound Mongo GeoSpatial Index

I have a MongoDB $within that looks like this:

db.action.find( { $and : [
    { actionType : "PLAY" },
    { 
        location : { 
            $within : { 
                $polygon : [ [ 0.0, 0.1 ], [ 0.0, 0.2 ] .. [ a.b, c.d ] ]
            } 
        }
    }
] } ).sort( { time : -1 } ).limit(50)

With regard to the action collection documents

The collection contains the following indexes

# I am interested recent actions 
db.action.ensureIndex({"time": -1}          

# I am interested in recent actions by a specific user
db.action.ensureIndex({"userId" : 1}, "time" -1}    

# I am interested in recent actions that relate to a unique song id
db.action.ensureIndex({"songId" : 1}, "time" -1}    

I am experimenting with the following two indexes

Identical queries with each index are explained below:

LocationOnly

{
    "cursor":"BasicCursor",
    "isMultiKey":false,
    "n":50,
    "nscannedObjects":91076,
    "nscanned":91076,
    "nscannedObjectsAllPlans":273229,
    "nscannedAllPlans":273229,
    "scanAndOrder":true,
    "indexOnly":false,
    "nYields":1,
    "nChunkSkips":0,
    "millis":1090,
    "indexBounds":{},
    "server":"xxxx"
}

LocationPlusTime

{
    "cursor":"BasicCursor",
    "isMultiKey":false,
    "n":50,
    "nscannedObjects":91224,
    "nscanned":91224,
    "nscannedObjectsAllPlans":273673,
    "nscannedAllPlans":273673,
    "scanAndOrder":true,
    "indexOnly":false,
    "nYields":44,
    "nChunkSkips":0,
    "millis":1156,
    "indexBounds":{},
    "server":"xxxxx"
}

Given

My questions are

My speculative thoughts are

UPDATE A sample document looks like this.

{ "_id" : "adba1154f1f3d4ddfafbff9bb3ae98f2a50e76ffc74a38bae1c44d251db315d25c99e7a1b4a8acb13d11bcd582b9843e335006a5be1d3ac8a502a0a205c0c527", 
  "_class" : "ie.soundwave.backstage.model.action.Action", 
  "time" : ISODate("2013-04-18T10:11:57Z"),
  "actionType" : "PLAY",
  "location" : { "lon" : -6.412839696767714, "lat" : 53.27401934563561 },
  "song" : { "_id" : "82e08446c87d21b032ccaee93109d6be", 
             "title" : "Motion Sickness", "album" : "In Our Heads", "artist" : "Hot Chip"
           }, 
  "userId" : "51309ed6e4b0e1fb33d882eb", "createTime" : ISODate("2013-04-18T10:12:59.127Z") 
}

UPDATE The geo-query looks like this https://www.google.com/maps/ms?msid=214949566612971430368.0004e267780661744eb95&msa=0&ll=-0.01133,-0.019226&spn=0.14471,0.264187

For various reasons approximately 250,000 documents exist in our DB at the point 0.0

Upvotes: 1

Views: 1371

Answers (1)

David Lynch
David Lynch

Reputation: 73

I played with this for a number of days and got the result I was looking for.

Firstly, given that action types other than "PLAY" CAN NOT have a location the additional query parameter "actionType==PLAY" was unnecessary and removed. Straight away I flipped from "time-reverse-b-tree" cursor to "Geobrowse-polygon" and for my test search latency improved by an order of 10.

Next, I revisited the 2dsphere as suggested by Derick. Again another latency improvement by roughly 5. Overall a much better user experience for map searches was achieved.

I have one refinement remaining. Queries in areas where there are no plays for a number of days have generally increased in latency. This is due to the query looking back in time until it can find "some play". If necessary, I will add in a time range guard to limit the search space of these queries to a set number of days.

Thanks for the hints Derick.

Upvotes: 2

Related Questions