Reputation: 4660
I have been trying to duplicate some relational DB queries into OpenSearch queries, and I have just crashed into some problems with nested objects and filtering. From what I gather, in order to properly use AND
operator logic on a nested object, you have to have ALL queries/filters under a single nested query. Having multiple nested queries, for the same nested object, joined by a must
operator won't cut it. Am I right on this? In other words:
Pseudocode:
NestedQuery(t1.Key = 1 && t1.Key = 5) != NestedQuery(t1.Key = 1) && NestedQuery(t1.Key = 5)
But if that is true, then what happens when you have many nested objects on your documents, and queries with logic that does not (easily? not at all?) allow the split of the queries in such a way, in order to get all the filters for a nested object under a single nested query?
For example, this SQL:
SELECT *
FROM Document d
INNER JOIN SubDocumentType1 t1 ON t1.Id = d.Id -- nested document 1
INNER JOIN SubDocumentType2 t2 ON t2.Id = d.Id -- nested document 2
WHERE (
(t1.Key = 1 OR t2.Key = 7) AND (t1.Key = 5 OR t2.Key = 9)
)
Here, t1.Key = 1
and t1.Key = 5
cannot be under the same nested query, since there is the extra t2
logic that is convoluted around it, and there is no single nested path for both nested document queries (since they are two different nested documents), nor a very obvious way to decompose this logic into something that keeps t1
logic under a nested query and t2
logic under another nested query without breaking the logic.
Is there a way to duplicate something like this in OpenSearch? My boolean algebra is a bit rusty, and perhaps someone can "simplify" expressions like this for OpenSearch, but is there any better and more automatic way? Maybe a library that converts queries like the above into something OpenSearch can handle?
I would guess that the above SQL query could be rewritten like so (if I am not mistaken...):
SELECT *
FROM Document d
INNER JOIN SubDocumentType1 t1 ON t1.Id = d.Id -- nested document 1
INNER JOIN SubDocumentType2 t2 ON t2.Id = d.Id -- nested document 2
WHERE (
(t1.Key = 1 AND t1.Key = 5) OR (t2.Key = 7 AND t2.Key = 9) OR
(t1.Key = 1 AND t2.Key = 9) OR (t2.Key = 7 AND t1.Key = 5)
)
...that seems to be a correct transfer of the query which could work in OpenSearch world, like in this pseudocode:
NestedQuery(t1.Key = 1 && t1.Key = 5) ||
NestedQuery(t2.Key = 7 && t2.Key = 9) ||
( NestedQuery(t1.Key = 1) && NestedQuery(t2.Key = 9) ) ||
( NestedQuery(t2.Key = 7) && NestedQuery(t1.Key = 5) )
...which can be simplified into this:
( NestedQuery(t1.Key = 1) && NestedQuery(t2.Key = 9) ) ||
( NestedQuery(t2.Key = 7) && NestedQuery(t1.Key = 5) )
...but is there any way to have this done automatically, with out me having to preprocess OpenSearch queries, to convert them into something "relationally correct" before sending them to OpenSearch? Because, you can easily imagine, this can get out of hand if you try to do it manually, both for queries like this one or for more complex ones...
I find it odd that no-one complains about this, since such complex queries are not so infrequent when it comes to search functionality. So, I wonder if I am missing something.
Upvotes: 0
Views: 805
Reputation: 4660
The answer is "no". You either have to transform the queries manually yourself or, if that is not possible, you have to create an algorithm that transforms the unusable queries into something that beaves like the original SQL query.
Upvotes: 0