mgo
mgo

Reputation: 41

Slow sort with skiplist?

I'm obviously missing something, when I run the following query I get a 40ms response:

FOR t IN tt
FOR a in aa 
FILTER a.from._key == t._key
AND a.value == @v
LIMIT 20 RETURN t

But when I add a sort the query runs until I have to kill it!

FOR t IN tt
FOR a in aa
FILTER a.from._key == t._key
AND a.value == @v
SORT t.identifier.value ASC
LIMIT 20 RETURN t

I have skiplist index on identifier.value but when I look at the execution plan I see a scary full collection scan.

2 EnumerateCollectionNode 240126 - FOR t IN tt   /* full collection scan */ 

I have 250000 documents in tt like:

{
  “_key”: "CA0350ED-4929-3D1A-37A5-8AB951CE3AB5"
  "type": “f.g.h”,
  "identifier": {
    “dist”: “N”,
    "value": "Matt100145 ZZZ"
  }
}

and 500000 documents in aa like:

{
  “_key”: "1BBDE335-B118-3ED5-DE3B-77D822822210"
  "type": “a.b.c”,
  "from": {
    "_key": "CA0350ED-4929-3D1A-37A5-8AB951CE3AB5",
    "value": "Matt100145 ZZZ”
    }
  },
  "value": “ZZZ”
}

All identifier.value values in tt are MattNUMBER ZZZ (test data) so maybe that is causing the trouble?

My query plan without the sort looks like:

Execution plan:
 Id   NodeType        Est.   Comment
  1   SingletonNode      1   * ROOT
  9   IndexNode          2     - FOR a IN aa   /* hash index scan */
  8   IndexNode          2       - FOR t IN tt   /* primary index scan */
  6   LimitNode          2         - LIMIT 0, 20
  7   ReturnNode         2         - RETURN t

Indexes used:
 By   Type      Collection   Unique   Sparse   Selectivity   Fields        Ranges
  9   hash      aa   false    false        50.00 %   [ `value` ]   (a.`value` == "Matt123")
  8   primary   tt       true     false       100.00 %   [ `_key` ]    (a.`from`.`_key` == t.`_key`)

and with the SORT looks like:

Execution plan:
 Id   NodeType                   Est.   Comment
  1   SingletonNode                 1   * ROOT
  2   EnumerateCollectionNode   85706     - FOR t IN tt   /* full collection scan */
  6   CalculationNode           85706       - LET #4 = t.`identifier`.`value`   /* attribute expression */   /* collections used: t : tt */
 10   IndexNode                 85706       - FOR a IN aa   /* hash index scan */
 11   CalculationNode           85706         - LET #2 = (a.`value` == "Matt123")   /* simple expression */   /* collections used: a : aa */
  5   FilterNode                85706         - FILTER #2
  7   SortNode                  85706         - SORT #4 ASC
  8   LimitNode                    20         - LIMIT 0, 20
  9   ReturnNode                   20         - RETURN t

Indexes used:
 By   Type   Collection   Unique   Sparse   Selectivity   Fields            Ranges
 10   hash   aa   false    false        50.00 %   [ `from._key` ]   (a.`from`.`_key` == t.`_key`)

Can anyone help?

Upvotes: 1

Views: 91

Answers (1)

stj
stj

Reputation: 9097

It's unclear in your case which indexes are actually there on the two collections. One index is mentioned, but to run well, two indexes would be even better. Here is the setup I used:

db.tt.ensureIndex({ type: "skiplist", fields: ["identifier.value"] });
db.aa.ensureIndex({ type: "skiplist", fields: ["value"] });

With this setup and some sample data, the execution plan I am getting with ArangoDB 3.3.12 is:

Query string: FOR t IN tt FOR a in aa FILTER a.from._key == t._key AND a.value == @v LIMIT 20 RETURN t

Execution plan:
 Id   NodeType         Est.   Comment
  1   SingletonNode       1   * ROOT
  9   IndexNode       25000     - FOR a IN aa   /* skiplist index scan */
  8   IndexNode       25000       - FOR t IN tt   /* primary index scan */
  6   LimitNode          20         - LIMIT 0, 20
  7   ReturnNode         20         - RETURN t

Indexes used:
 By   Type       Collection   Unique   Sparse   Selectivity   Fields        Ranges
  9   skiplist   aa           false    false            n/a   [ `value` ]   (a.`value` == 123)
  8   primary    tt           true     false       100.00 %   [ `_key` ]    (a.`from`.`_key` == t.`_key`)

This seems to be a good execution plan. Can you check if these indexes help in your case too?

(note: next time please post all indexes you have created plus the full execution plan and not just a fragment - that way you may get more precise help)

Upvotes: 1

Related Questions