SabetiG

Reputation: 409

Run a Cron Job every 30mins after onCreate Firestore event

I want a cron job/scheduler that runs every 30 minutes after an onCreate event occurs in Firestore. The cron job should trigger a Cloud Function that picks up the documents created in the last 30 minutes, validates them against a JSON schema, and saves them in another collection. How do I achieve this by programmatically writing such a scheduler? Also, what would be a fail-safe mechanism, and how could I queue or track the documents created before the cron job runs so they can be pushed to the other collection?

Upvotes: 2

Views: 1324

Answers (3)

sceee

Reputation: 2163

Building a queue with Firestore is simple and fits your use case well. The idea is to write tasks with a due date to a queue collection and process them once they are due.

Here's an example.

  1. Whenever your initial onCreate event for your collection occurs, write a document with the following data to a tasks collection:
    duedate: new Date() + 30 minutes
    type: 'yourjob'
    status: 'scheduled'
    data: '...' // <-- put whatever data here you need to know when processing the task

  2. Have a worker pick up available work regularly, e.g. every minute, depending on your needs:
import * as functions from 'firebase-functions';
import * as admin from 'firebase-admin';

admin.initializeApp();
const db = admin.firestore();

// Assumed shape of the Workers type, which the original snippet does not define
type Workers = { [type: string]: (data: any) => Promise<unknown> };

// Define what happens on what task type
const workers: Workers = {
  yourjob: (data) => db.collection('xyz').add({ foo: data }),
};


// The following needs to be scheduled

export const checkQueue = functions.https.onRequest(async (req, res) => {
  // Consistent timestamp, used for the query below
  const now = admin.firestore.Timestamp.now();
  // Check which tasks are due
  const query = db.collection('tasks').where('duedate', '<=', now).where('status', '==', 'scheduled');
  const tasks = await query.get();
  // Collect one promise per task so we can wait for all of them
  const jobs: Promise<unknown>[] = [];
  // Process tasks and mark each one in the queue as done or failed
  tasks.forEach(snapshot => {
    const { type, data } = snapshot.data();
    console.info('Executing job for task ' + JSON.stringify(type) + ' with data ' + JSON.stringify(data));
    const job = workers[type](data)
      // Update task doc with status or error
      .then(() => snapshot.ref.update({ status: 'complete' }))
      .catch((err) => {
        console.error('Error when executing worker', err);
        return snapshot.ref.update({ status: 'error' });
      });

    jobs.push(job);
  });
  try {
    await Promise.all(jobs);
    res.send('ok');
  } catch (err) {
    // Always answer the request, even if a status update failed
    console.error('Error', err);
    res.status(500).send('error');
  }
});
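
Step 1 above (writing the task when the onCreate event fires) could be sketched roughly as follows. Note that `new Date() + 30 minutes` is pseudocode, so the due date is computed from milliseconds here. The names `QueuedTask`, `buildTask` and `DELAY_MS`, and the source collection `items` in the commented wiring, are illustrative assumptions and not part of the original answer.

```typescript
// Shape of a queue entry, mirroring the fields listed in step 1.
interface QueuedTask<T> {
  duedate: Date;
  type: string;
  status: 'scheduled' | 'complete' | 'error';
  data: T;
}

const DELAY_MS = 30 * 60 * 1000; // 30 minutes in milliseconds

// Build the task document for a newly created source document.
// `now` is injectable so the function can be tested deterministically.
function buildTask<T>(type: string, data: T, now: Date = new Date()): QueuedTask<T> {
  return {
    duedate: new Date(now.getTime() + DELAY_MS),
    type,
    status: 'scheduled',
    data,
  };
}

// Hypothetical wiring (assumes the firebase-functions v1 API and a source
// collection named 'items'; neither name comes from the original answer):
//
// export const enqueueOnCreate = functions.firestore
//   .document('items/{id}')
//   .onCreate((snap) => db.collection('tasks').add(buildTask('yourjob', snap.data())));
```

Keeping the task construction in a plain function means the 30-minute offset can be checked without touching Firestore.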

You have different options to trigger the checking of the queue if there is a task that is due:

  • Use an HTTP function as in the example above. This requires you to call the function regularly via HTTP so that it runs and checks whether any task is due. Depending on your needs, you could do this from your own server or use a service like cron-job.org to make the calls. Note that the HTTP function is publicly reachable, so others could also call it. However, if you make your check code idempotent, that shouldn't be an issue.
  • Use the Firebase "internal" cron option, which relies on Cloud Scheduler under the hood. With it you can trigger the queue check directly:
    export const scheduledFunctionCrontab =
    functions.pubsub.schedule('* * * * *').onRun((context) => {
        console.log('This will be run every minute!');
        // Include code from checkQueue here from above
    });

Using such a queue also makes your system more robust: if something goes wrong along the way, you will not lose tasks that would otherwise exist only in memory. As long as tasks are not marked as processed, a corrected worker can pick them up and reprocess them. This, of course, depends on your implementation.
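
To illustrate why repeated or extra invocations are harmless when the check only touches tasks that are due and still 'scheduled', here is a small in-memory model of the queue. It is only a sketch: `Task`, `Status` and `runQueue` are hypothetical names, the array stands in for the tasks collection, and concurrent invocations against real Firestore would additionally need some way to claim a task (for example a transaction), which is not shown.

```typescript
type Status = 'scheduled' | 'complete' | 'error';

interface Task {
  id: string;
  duedate: Date;
  status: Status;
  data: unknown;
}

// Process every task that is due and still 'scheduled'.
// Returns the ids of the tasks handled in this run.
function runQueue(tasks: Task[], now: Date, worker: (data: unknown) => void): string[] {
  const handled: string[] = [];
  for (const task of tasks) {
    // Skip tasks that were already processed or are not due yet
    if (task.status !== 'scheduled' || task.duedate.getTime() > now.getTime()) continue;
    try {
      worker(task.data);
      task.status = 'complete';
    } catch (err) {
      task.status = 'error';
    }
    handled.push(task.id);
  }
  return handled;
}
```

Calling `runQueue` twice with the same `now` handles each due task only once, because the second run no longer finds it in the 'scheduled' state.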

Upvotes: 3

Dmitri Borohhov

Reputation: 1613

An easy way is to add a created field with a timestamp and then have a scheduled function run at a fixed interval (say, once a minute) and execute your code for all records where created >= NOW - 31 mins AND created <= NOW - 30 mins (pseudocode). If your timing precision requirements are not very strict, that should work for most cases.
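
The pseudocode window above could be computed like this. It is a minimal sketch; `dueWindow` and `MINUTE_MS` are hypothetical names, and the collection name `items` in the commented query is an assumption.

```typescript
const MINUTE_MS = 60 * 1000;

// Returns the range of `created` values that are between 30 and 31 minutes
// old at `now`, matching: created >= NOW - 31 mins AND created <= NOW - 30 mins.
function dueWindow(now: Date): { from: Date; to: Date } {
  return {
    from: new Date(now.getTime() - 31 * MINUTE_MS),
    to: new Date(now.getTime() - 30 * MINUTE_MS),
  };
}

// Hypothetical use inside a scheduled function that runs once a minute:
// const { from, to } = dueWindow(new Date());
// db.collection('items').where('created', '>=', from).where('created', '<=', to).get();
```

Because both bounds are inclusive, a document created exactly on a boundary could match two consecutive runs, and a delayed or skipped run can leave a gap. That is the precision trade-off the answer mentions.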

If this doesn't suit your needs, you can use Cloud Tasks (a Google Cloud product). The details are described in this good article.

Upvotes: 1

Vikram Shinde

Reputation: 1028

You can trigger a Cloud Function on the Firestore create event that schedules a Cloud Task to run 30 minutes later. This gives you queuing and a retry mechanism.
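
As a rough sketch of this approach, the function below builds an HTTP task payload with a schedule time 30 minutes in the future. The names `DelayedHttpTask` and `buildDelayedTask`, the target URL, and the commented wiring with the `@google-cloud/tasks` client are assumptions for illustration; the answer itself does not specify them.

```typescript
// Minimal shape of an HTTP task for Cloud Tasks (only the fields used here).
interface DelayedHttpTask {
  httpRequest: {
    httpMethod: 'POST';
    url: string;
    headers: { [name: string]: string };
    body: string; // base64-encoded request body
  };
  scheduleTime: { seconds: number };
}

// Build a task that should run `delaySeconds` after `nowMs`.
// `bodyBase64` is the already encoded payload, e.g. in Node:
// Buffer.from(JSON.stringify(payload)).toString('base64')
function buildDelayedTask(
  url: string,
  bodyBase64: string,
  delaySeconds: number,
  nowMs: number = Date.now()
): DelayedHttpTask {
  return {
    httpRequest: {
      httpMethod: 'POST',
      url,
      headers: { 'Content-Type': 'application/json' },
      body: bodyBase64,
    },
    scheduleTime: { seconds: Math.floor(nowMs / 1000) + delaySeconds },
  };
}

// Hypothetical use inside the onCreate trigger (project, location, queue and
// url are placeholders you would supply):
// const client = new CloudTasksClient();
// const parent = client.queuePath(project, location, queue);
// await client.createTask({ parent, task: buildDelayedTask(url, body, 30 * 60) });
```

The handler behind `url` would then validate the document and copy it to the other collection; retries are governed by the queue's configuration.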

Upvotes: 2
