Reputation: 1698
I am new ArangoDB user and I am using the following query
FOR i IN meteo
FILTER
i.`POM` == "Maxial"
&& TO_NUMBER(i.`TMP`) < 4.2
&& DATE_TIMESTAMP(i.`DTM`) > DATE_TIMESTAMP("2014-12-10")
&& DATE_TIMESTAMP(i.`DTM`) < DATE_TIMESTAMP("2014-12-15")
RETURN
i.`TMP`
on a 2 million document collection. It has an index on the three fields that are filtered. It takes aprox. 9 secs on the Web Interface.
Is it possible to run it faster?
Thank you
Hugo
Upvotes: 2
Views: 629
Reputation: 9097
I have no access to the underlying data and data distribution nor the exact index definitions, so I can only give rather general advice:
explain()
command in order to see if the query makes use of indexes, and if yes, which.explain()
shows that no index is used, check if the attributes contained in the query's FILTER conditions are actually indexed. There is the db.<collection>.getIndexes()
command to check which attributes are indexed.==
) but not for other comparison types (<
, <=
, >
, >=
etc.). A hash index will only be used if all the indexed attributes are used in the query's FILTER conditions. A skiplist index will only be used if at least its first attribute is used is used in a FILTER condition. If further of the skiplist index attributes are specified in the query (from left-to-right), they may also be used and allow to filter more documents. "POM"
, "TMP"
, and "DTM"
won't help this query because it will only use one of them per collection that it iterates over. Instead, I suggest trying to put multiple attributes into an index if the query could benefit from this.[ "POM", "DTM" ]
may be the right choice (in combination with 6.)"POM"
attribute. The reason is that the other attributes are used inside function calls (i.e. TO_NUMBER()
, DATE_TIMESTAMP()
). In general, indexes will not be used for attributes which are used inside functions (e.g. for TO_NUMBER(i.tmp) < 4.2
no index will be used. Same for DATE_TIMESTAMP(i.DTM) > DATE_TIMESTAMP("2014-12-10")
. Modifying the conditions so the indexed attributes are directly compared to some constant or a one-time calculated value can enable more candidate indexes. If possible, try to rewrite the conditions so that only the indexed attributes are present on the one side of the comparison. For this particular query, it would be better to use i.DTM > "2014-12-10"
instead of DATE_TIMESTAMP(i.DTM) > DATE_TIMESTAMP("2014-12-10")
.Upvotes: 3