Lukas Bühler
Lukas Bühler

Reputation: 168

Why is it not possible to orderBy on different fields in Cloud Firestore and how can I work around it?

I have a collection in firebase cloud firestore called 'posts' and I want to show the most liked posts in the last 24h on my web app. The post documents have a field called 'like_count' (number) and another field called 'time_posted' (timestamp). I also want to be able to limit the results to apply pagination.

I tried to apply a filter to only get the posts posted in the last 24 hours and then ordering them by the 'like_count' and then the 'time_posted' since I want the posts with the most likes to appear first.

postsRef.where("time_posted", ">", twentyFourHoursAgo)
        .orderBy("like_count", "desc")
        .orderBy("time_posted", "desc")
        .limit(10)

However, I quickly found out that it is not possible to filter and then sort by two different fields. (See the Limitations part of the documentation for Order and limit data with Cloud Firestore)

It states:

Invalid: Range filter and first orderBy on different fields

I thought about sorting the results by 'like_count' in the frontend, but this won't work properly because I don't have all the documents. And getting all the documents is infeasible for a large number of daily posts.

Is there an easy work-around I am missing or how can I go about this?

Upvotes: 0

Views: 177

Answers (1)

samthecodingman
samthecodingman

Reputation: 26306

When performing a query, Firestore must be able to traverse an index in a continuous fashion.

This introduction video is a little outdated (because "OR" queries are now possible using the "in" operator) but it does give a good visualization of what Firestore is doing as it runs a query.

If your query was just postsRef.orderBy("like_count", "desc").limit(10), Firestore would load up the index it has for a descending "like_count", pluck the first 10 entries and return them.

To handle your query, it would have to pluck an entry off the descending "like_count" index, compare it to your "time_posted" requirement, and either discard it or add it to a list of valid entries. Once it has all of the recent posts, it then needs to sort the results as you specified. As these steps don't make use of a continuous read of an index, it is disallowed.

The solution would be to build your own index from the recent posts and then pluck the results off of that. Because doing this on the client is ill-advised, you should instead make use of a Cloud Function to do the work for you. The following code makes use of a Callable Cloud Function.

const MS_TWENTY_FOUR_HOURS = 24 * 60 * 60 * 1000;
export getRecentTopPosts = function.https.onCall((data, context) => {

  // unless otherwise stated, return only 10 entries
  const limit = Number(data.limit) || 10;

  const postsRef = admin.firestore().collection("posts");

  // OPTIONAL CODE SEGMENT: Check Cached Index

  const twentyFourHoursAgo = Date.now() - MS_TWENTY_FOUR_HOURS;
  const recentPostsSnapshot = await postsRef
    .where("time_posted", ">", twentyFourHoursAgo)
    .get();

  const orderedPosts = recentPostsSnapshot.docs
    .map(postDoc => ({
      snapshot: postDoc,
      like_count: postDoc.get("like_count"),
      time_posted: postDoc.get("time_posted")
    })
    .sort((p1, p2) => {
      const deltaLikes = p2.like_count - p1.like_count; // descending sort based on like_count
      if (deltaLikes !== 0) {
        return deltaLikes;
      }
      return p2.time_posted - p1.time_posted; // descending sort based on time_posted
    });

  // OPTIONAL CODE SEGMENT: Save Cached Index

  return orderedPosts
    .slice(0, limit)
    .map(post => ({
      _id: post.snapshot.id,
      ...post.snapshot.data()
    }));
})

If this code is expected to be called by many clients, you may wish to cache the index to save it getting constantly rebuilt by inserting the following segments into the function above.

// OPTIONAL CODE SEGMENT: Check Cached Index

if (!data.skipCache) { // allow option to bypass cache
  const cachedIndexSnapshot = await admin.firestore()
    .doc("_serverCache/topRecentPosts")
    .get();
  
  const oneMinuteAgo = Date.now - 60000;

  // if the index was created in the past minute, reuse it
  if (cachedIndexSnapshot.get("timestamp") > oneMinuteAgo) {    
    const recentPostMetadataArray = cachedIndexSnapshot.get("posts");
    const recentPostIdArray = recentPostMetadataArray
      .slice(0, limit)
      .map((postMeta) => postMeta.id)
    
    const postDocs = await fetchDocumentsWithId(postsRef, recentPostIdArray); // see https://gist.github.com/samthecodingman/aea3bc9481bbab0a7fbc72069940e527
    
    // postDocs is not ordered, so we need to be able to find each entry by it's ID
    const postDocsById = {};
    for (const doc of postDocs) {
      postDocsById[doc.id] = doc;
    }
    
    return recentPostIdArray
      .map(id => {
        // may be undefined if not found (i.e. recently deleted)
        const postDoc = postDocsById[id];

        if (!postDoc) {
          return null; // deleted post, up to you how to handle
        } else {
          return {
            _id: postDoc.id,
            ...postDoc.data()
          };
        }
      });
  }
}
// OPTIONAL CODE SEGMENT: Save Cached Index

if (!data.skipCache) { // allow option to bypass cache
  await admin.firestore()
    .doc("_serverCache/topRecentPosts")
    .set({
      timestamp: Date.now(),
      posts: orderedPosts
        .slice(0, 25) // cache the maximum expected amount
        .map(post => ({
          id: post.snapshot.id,
          like_count: post.like_count,
          time_posted: post.time_posted,
        }))
    });
}

Other improvements you could add to this function include:

  • A field mask - i.e. instead of return every part of the post documents, return just the title, like count, time posted and the author.
  • Variable post age (instead of 24 hours)
  • Variable minimum likes count
  • Filter by author

Upvotes: 1

Related Questions