AndreaNobili

Reputation: 42957

Why can't Firestore perform a query with inequality conditions on different fields?

I am very new to the Firestore database and I have no previous experience with NoSQL databases (I come from classical relational DBs).

I know that I can perform a query like this:

return this.db.collection(
  'courses',
  ref => ref.where("seqNo", "==", "5")       // equality filter
            .where("lessonCount", ">=", 5)   // the only range filter
);

This should retrieve all the objects (from my courses collection) where the seqNo field is equal to 5 and where the lessonCount field has a value >= 5.

Ok, this query will work fine.

What I don't understand is why some simple and pretty natural queries are impossible to implement in Firestore. Take a classical range query, for example. Changing the previous query, I want to retrieve all the courses having seqNo >= 5 and lessonCount >= 5:

return this.db.collection(
  'courses',
  ref => ref.where("seqNo", ">=", "5")       // range filter on one field
            .where("lessonCount", ">=", 5)   // range filter on a second field: not allowed
);

I know that it is a tradeoff related to performance and to the indexes.

But I can't understand why Firestore can't execute a query where the inequality conditions are on different fields.

Why? How does it work under the hood? What am I missing?

Upvotes: 0

Views: 380

Answers (2)

Luis Rita

Reputation: 392

As of October 2020, not-equal queries have been implemented in Firestore, although with some limitations. See "Cloud Firestore now supports not equal queries".

Upvotes: 0

Doug Stevenson

Reputation: 317427

I'm sure someone has a much more comprehensive explanation than I do, but I think of it this way. Firestore makes a guarantee that queries execute in O(n) time where n is the number of matching documents from the query. In other words, queries scale according to the number of documents requested. Firestore was designed for this to be true in all (or nearly all) situations, and at massive scale (think billions of documents in a collection). As long as you have indexes that support the query, the query will nearly always perform well, and you never worry about scaling or sharding. It just works.

When you use a range query on a field, that requires an ascending or descending index on the field. The index knows the order of all the documents in the collection with respect to that field, and Firestore can use that index to efficiently find the range of possible documents to match. However, when you introduce a second range on another field, that requires a completely different index with a completely different potential range match. At Firestore's massive scale, these ranges cannot be merged while still giving the performance guarantee stated above. Firestore would have to load the entire set of documents that could match either range into memory (or some temp storage), then find the overlap between the two, then apply any other filters to it. Think about that for a moment: Firestore is simply unwilling to load billions of documents into memory to find range overlaps, for performance reasons.
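To make the argument above concrete, here is a toy in-memory model of it. This is only a sketch, not how Firestore is actually implemented: the sorted-array "index", the names buildIndex, lowerBound, rangeScan and twoRangeScan, and the sample data are all assumptions made up for illustration.

```typescript
// Toy model only: an "index" here is an array of entries sorted by one field.
interface Course { id: string; seqNo: number; lessonCount: number; }
interface IndexEntry { key: number; id: string; }

// Build an ascending single-field index (hypothetical helper).
function buildIndex(docs: Course[], field: "seqNo" | "lessonCount"): IndexEntry[] {
  return docs
    .map((d) => ({ key: d[field], id: d.id }))
    .sort((a, b) => a.key - b.key);
}

// Binary search for the position of the first entry whose key is >= min.
function lowerBound(index: IndexEntry[], min: number): number {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (index[mid].key < min) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// One range filter: seek once, then read a contiguous slice.
// Every entry read is a match, so work is proportional to the result size.
function rangeScan(index: IndexEntry[], min: number): string[] {
  return index.slice(lowerBound(index, min)).map((e) => e.id);
}

// Two range filters on different fields: each index yields its own slice,
// and the slices must be intersected. Work now depends on the size of both
// slices, not on the number of documents that match both conditions.
function twoRangeScan(
  a: IndexEntry[], minA: number,
  b: IndexEntry[], minB: number,
): { ids: string[]; entriesRead: number } {
  const fromA = rangeScan(a, minA);
  const fromB = new Set(rangeScan(b, minB));
  return {
    ids: fromA.filter((id) => fromB.has(id)),
    entriesRead: fromA.length + fromB.size,
  };
}

// Made-up sample data.
const courses: Course[] = [
  { id: "a", seqNo: 1, lessonCount: 9 },
  { id: "b", seqNo: 6, lessonCount: 2 },
  { id: "c", seqNo: 7, lessonCount: 8 },
  { id: "d", seqNo: 9, lessonCount: 1 },
];
const bySeqNo = buildIndex(courses, "seqNo");
const byLessonCount = buildIndex(courses, "lessonCount");

const single = rangeScan(bySeqNo, 5);                         // seqNo >= 5
const both = twoRangeScan(bySeqNo, 5, byLessonCount, 5);      // seqNo >= 5 AND lessonCount >= 5
console.log(single, both);
```

In this toy data set the two-range query returns a single document but has to read five index entries to find it. With billions of documents per range, that gap between entries read and documents returned is what breaks the guarantee described above.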

It's not to say that Firestore couldn't do that at all, but that would drastically increase the magnitude of the problem. If it offered multiple range queries, I would expect that you'd pay pretty dearly for the computing resources to make that happen, and you'd not get its original performance guarantees any more.

For more flexible querying, you can mirror your data to BigQuery (there is even an extension for that), which will probably give you as much flexibility as you desire. Just don't expect your queries to be very fast, and be prepared to pay the costs of BigQuery's data warehousing features.

The upside of limiting to a single range query is that performance never degrades at scale. The downside is that you can't do everything you want to do. There are lots of workarounds that have been discussed on Stack Overflow if you want some ideas.
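As one illustration of such a workaround (a sketch under assumptions, not a recommendation from the Firestore team): let Firestore apply a single range filter, then apply the second range in application code. The CourseDoc shape, the applySecondRange helper and the sample data are hypothetical, and the array stands in for documents that a single-range query has already returned.

```typescript
// Hypothetical document shape, mirroring the fields used in the question.
interface CourseDoc { title: string; seqNo: number; lessonCount: number; }

// Apply the second inequality on the client, to documents that were already
// fetched with a single-range query such as where("seqNo", ">=", 5).
function applySecondRange(fetched: CourseDoc[], minLessonCount: number): CourseDoc[] {
  return fetched.filter((c) => c.lessonCount >= minLessonCount);
}

// Stand-in for the result of the single-range query (made-up data).
const fetchedBySeqNo: CourseDoc[] = [
  { title: "Intro", seqNo: 5, lessonCount: 3 },
  { title: "Advanced", seqNo: 6, lessonCount: 12 },
  { title: "Deep dive", seqNo: 8, lessonCount: 5 },
];

const matching = applySecondRange(fetchedBySeqNo, 5);
console.log(matching.map((c) => c.title));
```

The tradeoff is that you read (and pay for) every document matching the first range, so this only makes sense when the first filter is already selective.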

Upvotes: 3
