Reputation: 11725
I have the following query
SELECT FROM Post
WHERE activityLog IS NOT NULL
AND activityLog_pointerProcessed IS NULL
LIMIT 1000
The properties are defined as
CREATE PROPERTY Post.activityLog_pointerProcessed BOOLEAN
CREATE PROPERTY Post.activityLog EMBEDDEDMAP EMBEDDEDMAP
However, it seems to be running fairly slowly. I have added the following index
CREATE INDEX Post_activityLog_pointerProcessed
ON Post (activityLog, activityLog_pointerProcessed)
NOTUNIQUE_HASH_INDEX
I'm wondering if this is the optimal index for this scenario or if there's a different way to set this up. The Post class is still filling up but currently has 2 million rows. The query takes ~5-10 seconds to run.
Here's what I get when I explain
it
orientdb {pumpup}> explain SELECT FROM Post WHERE activityLog IS NOT NULL AND activityLog_pointerProcessed IS NULL LIMIT 1000
Profiled command '{documentReads:378236,current:#22:378235,documentAnalyzedCompatibleClass:378236,recordReads:378236,fetchingFromTargetElapsed:337922,evaluated:378236,elapsed:337922.88,resultType:collection,resultSize:1000}' in 338.688995 sec(s): {"@type":"d","@version":0,"documentReads":378236,"current":"#22:378235","documentAnalyzedCompatibleClass":378236,"recordReads":378236,"fetchingFromTargetElapsed":337922,"evaluated":378236,"elapsed":337922.88,"resultType":"collection","resultSize":1000,"@fieldTypes":"documentReads=l,documentAnalyzedCompatibleClass=l,recordReads=l,fetchingFromTargetElapsed=l,evaluated=l,elapsed=f"}
Upvotes: 1
Views: 1187
Reputation: 9060
By reading the explain result, the index hasn't been used and 378,236 documents were scanned.
The activityLog property has been declared as an EMBEDDEDMAP, so the internal values are indexed, not the fact the collection is null. I don't think it will work, but make a try enabling null values on index:
https://github.com/orientechnologies/orientdb/wiki/Indexes#null-values
Probably the best for this case is saving another boolean value "activityLog.empty" to be indexed and retrieve your query much faster.
CREATE PROPERTY Post.activityLogEmpty BOOLEAN
CREATE INDEX Post_activityLog_pointerProcessed ON Post (activityLogEmpty, activityLog_pointerProcessed)
NOTUNIQUE_HASH_INDEX
On creation set the property "activityLogEmpty" to FALSE and as soon as you populate the "activityLog" collection set it to true.
Upvotes: 2