Let's say we're building a chat app with Firestore, and we have two top-level collections, "chat_rooms" and "users". A natural way to structure the data would be to store an array of userIDs in each chat room document.
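As a rough sketch of that document shape (only the "userIDs" field name comes from the question; the ChatRoom type and the sample IDs are made up for illustration), a chat room document could be modeled like this:

```swift
import Foundation

// Hypothetical model of one document in the "chat_rooms" collection.
// Only the "userIDs" field is taken from the question; everything else is assumed.
struct ChatRoom: Codable {
    var userIDs: [String]
}

// Made-up sample data.
let room = ChatRoom(userIDs: ["alice", "bob", "carol"])

// Print the document as JSON to show the stored shape.
let encoder = JSONEncoder()
encoder.outputFormatting = [.prettyPrinted, .sortedKeys]
let data = try encoder.encode(room)
print(String(data: data, encoding: .utf8)!)
```

Each chat room document carries a single array field, so membership lives in one place per room.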
In this way, querying all chat rooms for a particular userID would use a single query: collection("chat_rooms").whereField("userIDs", arrayContains: "userID") (with the Swift SDK). Similarly, querying all users in a chat room would use a single query: collection("users").whereField("userID", in: userIDs), where userIDs is an array of strings.
However, I'm a bit worried about this approach if the app were to scale to millions of users. According to the Firestore docs, "For each array field in a document, Cloud Firestore creates and maintains a collection-scope array-contains index." If we have millions of users, this would imply that the "userIDs" array field would contain millions of distinct values across the collection and that Firestore would create millions of indexes, one for each unique userID.
Is this assumption correct and, if so, what would be a better way to structure the data for scalability?
Upvotes: 0
Views: 32
Reputation: 317750
You're misunderstanding the statement:
"For each array field in a document, Cloud Firestore creates and maintains a collection-scope array-contains index."
This means that for each array field (not array field value), Firestore will create a single index for all the documents with that one named field. It will not create an index for each individual value being stored in that field across all the documents in the collection.
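To make the distinction concrete, here is a toy in-memory model in plain Swift. It is not how Firestore is implemented internally; the ArrayContainsIndex type, the room IDs, and the user IDs are all assumptions made for illustration. It shows only the counting argument: one index per named array field, with each distinct value being an entry inside that index.

```swift
// Toy model only; Firestore's real index storage is not shown here.
struct ArrayContainsIndex {
    // Maps a value to the IDs of documents whose array field contains it.
    private(set) var entries: [String: Set<String>] = [:]

    mutating func add(documentID: String, values: [String]) {
        for value in values {
            entries[value, default: []].insert(documentID)
        }
    }

    func documents(containing value: String) -> Set<String> {
        return entries[value] ?? []
    }
}

// One index per named array field in the collection, keyed here by field name.
var indexesByField: [String: ArrayContainsIndex] = [:]

// Made-up chat rooms: document ID -> contents of its "userIDs" array.
let rooms: [String: [String]] = [
    "room1": ["alice", "bob"],
    "room2": ["bob", "carol"],
    "room3": ["carol", "dave", "alice"],
]

for (roomID, userIDs) in rooms {
    indexesByField["userIDs", default: ArrayContainsIndex()]
        .add(documentID: roomID, values: userIDs)
}

let userIDsIndex = indexesByField["userIDs"]!
print(indexesByField.count)                               // 1 index for the one array field
print(userIDsIndex.entries.count)                         // 4 entries, one per distinct user ID
print(userIDsIndex.documents(containing: "bob").sorted()) // ["room1", "room2"]
```

Adding more users or rooms grows the number of entries inside the single "userIDs" index, not the number of indexes.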
There is no scalability problem here if you just have a single named field which is an array of values, even if that field appears in millions of documents. Firestore can handle that with no issues.
Upvotes: 0