Firestore where clause with big dataset

Question

I have the following structure in my Firestore database:

messages:
    m1:
        title: "Message 1"
        ...
        archived: false
    m2:
        title: "Message 2"
        ...
        archived: true

Let's say I have 20k messages and I want to get the archived messages using a "where" clause. Will my query be slower than if I structured my database as follows?

nonArchivedMessages:
    m1:
        title: "Message 1"
        ...
archivedMessages:
    m2:
        title: "Message 2"
        ...

The second structure seems, to me, better suited to large datasets, but it causes problems in some cases, such as getting a message without knowing whether it is archived or not.

Frank van Puffelen · Accepted Answer

One of the guarantees for Cloud Firestore is that the time it takes to retrieve a certain number of documents is not dependent on the total number of documents in the collection.

That means that in your first data model, if you load 100 archived documents and (for example) it takes 1 second, you know that it'll always take about 1 second to load 100 archived documents, no matter how many documents there are in the collection.
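To make that guarantee concrete, here is a toy model in plain Python. It is only a sketch of the idea, not how Firestore is implemented: the sorted list stands in for an index on `archived`, and the helper names (`build_archived_index`, `load_archived`, `make_messages`) and the generated data are made up for illustration. The point it shows is that the number of index entries read is bounded by the requested limit, not by the size of the collection.

```python
import bisect


def build_archived_index(messages):
    """Sorted (archived, doc_id) pairs: a stand-in for a single-field index."""
    return sorted((doc["archived"], doc_id) for doc_id, doc in messages.items())


def load_archived(messages, index, limit):
    """Return up to `limit` archived docs and the number of index entries read."""
    # Binary search seeks to the first archived=True entry (False sorts before True).
    start = bisect.bisect_left(index, (True, ""))
    docs = []
    entries_read = 0
    # From there, only the entries that end up in the result are scanned.
    for _archived, doc_id in index[start:start + limit]:
        entries_read += 1
        docs.append(messages[doc_id])
    return docs, entries_read


def make_messages(total):
    # Hypothetical data: every other message is archived.
    return {
        f"m{i}": {"title": f"Message {i}", "archived": i % 2 == 1}
        for i in range(total)
    }


small = make_messages(1_000)
large = make_messages(20_000)
docs_small, read_small = load_archived(small, build_archived_index(small), 100)
docs_large, read_large = load_archived(large, build_archived_index(large), 100)
print(read_small, read_large)  # 100 100
```

In this toy the seek is a binary search over the index, so the work after the seek depends only on how many documents are returned.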

With that knowledge, the only difference between your two data models is that in the first model you need a query to select the archived messages, while in the second model you don't need a query. Queries on Cloud Firestore run by accessing an index, so the difference is that one more index is read in the first data model. This has a small impact on the execution time, but it is insignificant compared to the time it takes to actually read the documents and return them to the client.
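As a rough sketch of what the two reads look like, here they are side by side. The question does not say which SDK is in use, so this assumes the Python server client (`google-cloud-firestore`); the function names and the limit of 100 are illustrative, and each argument is assumed to be a `CollectionReference` such as `db.collection("messages")`.

```python
def fetch_archived_with_query(messages_ref, limit=100):
    """First data model: filter the single `messages` collection.

    `messages_ref` is assumed to be a google-cloud-firestore
    CollectionReference. Newer client releases prefer the keyword form
    where(filter=FieldFilter("archived", "==", True)); the positional
    form is used here to keep the sketch free of extra imports.
    """
    query = messages_ref.where("archived", "==", True).limit(limit)
    return {snap.id: snap.to_dict() for snap in query.stream()}


def fetch_archived_collection(archived_ref, limit=100):
    """Second data model: read `archivedMessages` directly, no filter."""
    return {snap.id: snap.to_dict() for snap in archived_ref.limit(limit).stream()}
```

Both functions return the same shape of result; the only difference is the extra `where` step in the first one.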

So: there may be other reasons to prefer the second data model, but the performance of reading the archived messages is going to be the same in both.
