Lily

Reputation: 1

How can I make sure that two Google Cloud Functions delete and insert records into Google Datastore sequentially?

I created a Cloud Function triggered by files in Cloud Storage. Trigger type: Cloud Storage, Event type: Finalize/Create.

When a file is uploaded into the bucket, this Cloud Function inserts a record (key: location) into the table_name kind in Datastore.

The requirement is to update the record when the file is deleted, so I created a second Cloud Function with the 'Delete' event type. That function copies the properties of the record into a new record (key: uuid) and deletes the old one.

These functions work well on their own. However, when I replace a file with one of the same name, the 'Delete' function and the 'Finalize/Create' function are triggered almost at the same time. Afterwards, table_name in Datastore only contains the new record (key: uuid), which means I have lost the record (key: location).

I tried adding a delay before inserting the record (key: location) on upload. It usually works but occasionally fails.

Alternatively, I would like to use a transaction to make sure the delete happens first and the insert happens afterwards, but I am not familiar with transactions.
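For reference, this is roughly what I understand a Datastore transaction to be: every read and write between run() and commit() either takes effect together or not at all. This is only a sketch (Node.js, using the kind and property names from my code below), not something I have working:

'use strict';
const {Datastore} = require('@google-cloud/datastore');
const datastore = new Datastore();  // project/namespace could also be passed explicitly

// Sketch only: read an entity, queue a write, then commit both atomically.
async function insertInTransaction(data) {
    const transaction = datastore.transaction();
    const key = datastore.key(['table_name', data.location]);
    try {
        await transaction.run();                        // start the transaction
        const [existing] = await transaction.get(key);  // read inside the transaction
        console.log(existing ? 'entity already exists' : 'entity not found');
        transaction.save({key: key, data: data});       // queued until commit
        await transaction.commit();                     // applied all-or-nothing
    } catch (err) {
        await transaction.rollback();
        throw err;
    }
}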

The Cloud Function (Node.js 8) triggered by uploaded files:

'use strict';
const {Datastore} = require('@google-cloud/datastore');

// Placeholders: set these to your project and namespace.
const projectId = 'my-project-id';
const namespace = 'my-namespace';

exports.upload = async (event, context) => {

    // event.name is the name of the Cloud Storage object that triggered the function.
    const processingFile = event.name;
    console.log(`  Created: ${event.timeCreated}`);

    let data = {
        property_a: '',
        property_b: '',
        location: processingFile
    };
    try {
        // await delay(5000);
        await insertData(data);
    } catch (err) {
        console.error(err);
    }
};

// Optional helper used for the commented-out delay above.
const delay = ms => new Promise(res => setTimeout(res, ms));

async function insertData(data) {
    const datastore = new Datastore({projectId: projectId, namespace: namespace});

    // Use the object name (location) as the entity's key name.
    let name = data.location;
    const taskKey = datastore.key(['table_name', name]);

    // Prepares the new entity
    const task = {
        key: taskKey,
        data: data
    };
    // Saves the entity
    await datastore.save(task);
    console.log(`Saved ${task.key.name}: ${task.data.location}`);
}

The delay does not always work and I would rather not rely on it. The Cloud Function (Python 3.7) triggered by deleted files:

from google.cloud import datastore

# Placeholders: set these to your project and namespace.
project_id = 'my-project-id'
namespace = 'my-namespace'

def delete(event, context):
    try:
        if not event['name'].endswith('/'):  # Exclude folder placeholder objects
            key_name = event['name']
            update(project_id, namespace, key_name)
    except Exception as e:
        print("Error: " + str(e))

def update(project_id, namespace, entityName):
    client = datastore.Client(project=project_id, namespace=namespace)

    with client.transaction():
        key = client.key('table_name', entityName)
        task = client.get(key)

        if not task:
            raise Exception("The entity does not exist.")

        # Copy the record into a new entity whose key gets an
        # auto-allocated id (the "key: uuid" record described above)
        incomplete_key = client.key('table_name')
        uuid_task = datastore.Entity(key=incomplete_key)
        data_properties = ['property_a', 'property_b', 'location']
        for data_property in data_properties:
            if data_property in task:
                uuid_task.update({
                    data_property : task[data_property]
                })
        client.put(uuid_task)

        # Delete the record(key: location) 
        client.delete(key)

In other words, for the record (key: location) it seems that I currently copy the record, replace it, and then delete it. What I want is to copy the record, delete it, and then insert the new record.

Could you give me any suggestion?

Upvotes: 0

Views: 176

Answers (1)

Brandon Yarbrough

Reputation: 38389

Every version of a Google Cloud Storage object has its own "generation" number. When you create a new object, your event will include the bucket name, the object name, and a generation number. When you overwrite that object with a new one, you'll get a delete notification about the older generation and a finalize notification about the newer generation.

In your code, the generation number is available on the event payload (event.generation in the Node.js function, event['generation'] in the Python one). I'd use it as part of your key, or as a precondition to changing the entity.
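For example, here is a minimal sketch of the first option (making the generation part of the key) for the Node.js upload function; the name#generation key format and the property names are just an illustration, not drop-in code:

'use strict';
const {Datastore} = require('@google-cloud/datastore');
const datastore = new Datastore();  // set projectId/namespace here if needed

// Sketch: include the object's generation in the entity key, so each
// overwrite of the same filename creates a distinct entity and the delete
// event for the old generation can never touch the new generation's record.
exports.upload = async (event, context) => {
    const taskKey = datastore.key(['table_name', `${event.name}#${event.generation}`]);
    const task = {
        key: taskKey,
        data: {
            property_a: '',
            property_b: '',
            location: event.name,
            generation: event.generation
        }
    };
    await datastore.save(task);
    console.log(`Saved ${taskKey.name}`);
};

The delete function can then build the same name#generation key from its own event, so it only ever touches the entity for the generation that was actually deleted.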

Upvotes: 1
