Cheok Yan Cheng

Reputation: 42806

Having a transaction without using the High Replication Datastore

I have an application that performs the following tasks.

  1. Authenticate the user by email and password.
  2. Save the uploaded file in BlobStore.
  3. Check the user's information in DataStore to see whether an old blob is associated with this user. If so, delete the old blob from BlobStore.
  4. Update DataStore to associate the new blob in BlobStore with this user.

I tried to perform steps 2, 3, and 4 within a transaction.

db.run_in_transaction(self.upload, email, checksum, version, content)

However, as expected, since I am accessing more than one entity, I get the following error:

BadRequestError: can't operate on multiple entity groups in a single transaction.

I am not happy about this. What is the use of a transaction if it cannot perform an atomic operation across multiple tables (entities)?

I am forced to use the High Replication Datastore, which will cost me in terms of billing.

db.run_in_transaction_options(xg_on, self.upload, email, checksum, version, content)

Again, I get the following error:

BadRequestError: Only ancestor queries are allowed inside transactions.

This happens on the line:

blob_key = files.blobstore.get_blob_key(file_name)

My questions are:

  1. Is there any way to perform a transaction across multiple "tables", as I can in PostgreSQL, without using the High Replication Datastore? The Master/Slave datastore would be good enough for me, as far as cost is concerned.
  2. What can I do to turn blob_key = files.blobstore.get_blob_key(file_name) into an ancestor query so that it works inside a transaction? In short, how can I make def upload work within a transaction?

The complete code is as follows:


import urllib
import logging
import model
import zlib
from google.appengine.api import urlfetch
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
from google.appengine.api import files
from google.appengine.ext import db
from google.appengine.ext import blobstore

xg_on = db.create_transaction_options(xg=True)


class Upload(webapp.RequestHandler):
    def post(self):
        email = self.request.get('Email')
        password = self.request.get('Passwd')
        checksum = int(self.request.get('Checksum'))
        version = int(self.request.get('Version'))
        logintoken = self.request.get('logintoken')
        logincaptcha = self.request.get('logincaptcha')
        content = self.request.get('file')

        if version == -1:
            self.response.out.write('ERROR [invalid parameter(s)]')
            return

        # Ensure the uploaded content is valid.
        if content is None or not content:
            self.response.out.write('ERROR [no file is uploaded]')
            return

        # Authentication.
        headers = {"Content-type": "application/x-www-form-urlencoded"}
        if logintoken and logincaptcha:
            form_data = urllib.urlencode({
                'accountType': 'HOSTED_OR_GOOGLE', 
                'Email': email,
                'Passwd': password,
                'service': 'mail',
                'source': 'JStock-1.05b',
                'logintoken': logintoken,
                'logincaptcha': logincaptcha
            })
        else:
            form_data = urllib.urlencode({
                'accountType': 'HOSTED_OR_GOOGLE', 
                'Email': email,
                'Passwd': password,
                'service': 'mail',
                'source': 'JStock-1.05b'
            })
        result = urlfetch.fetch(url='https://www.google.com/accounts/ClientLogin', payload=form_data, method=urlfetch.POST, headers={'Content-Type': 'application/x-www-form-urlencoded'})
        self.response.set_status(result.status_code)
        if result.status_code != 200:
            # Fail. Either incorrect password or captcha information required.
            self.response.out.write(result.content)
            return

        # OK! This is a valid user. Let's proceed with checksum verification.
        ##if checksum != zlib.adler32(content):
        ##    self.response.out.write('ERROR [fail in checksum]')
        ##    return            

        #db.run_in_transaction(self.upload, email, checksum, version, content)
        db.run_in_transaction_options(xg_on, self.upload, email, checksum, version, content)
        #self.upload(email, checksum, version, content)


    def upload(self, email, checksum, version, content):
        # Create the file
        file_name = files.blobstore.create(mime_type='application/octet-stream', _blobinfo_uploaded_filename=email)

        # Open the file and write to it
        with files.open(file_name, 'a') as f:
            f.write(content)

        # Finalize the file. Do this before attempting to read it.
        files.finalize(file_name)

        # Get the file's blob key
        blob_key = files.blobstore.get_blob_key(file_name)

        # Remove previous blob referenced by this human.
        query = model.Human.all()
        query.filter('email =', email)
        for q in query:
            blobstore.delete(q.content.key())

        human = model.Human(key_name=email, email=email, checksum=checksum, version=version, content=blob_key)
        human.put()


application = webapp.WSGIApplication([
    ('/upload.py', Upload)
], debug=True)


def main():
    run_wsgi_app(application)


if __name__ == '__main__':
    main()

Upvotes: 0

Views: 470

Answers (2)

Nick Johnson

Reputation: 101149

The HR datastore and M/S datastore are now the same price under the new billing. There's really no reason not to use the HR datastore.

get_blob_key has to do a query to find the blob corresponding to the filename. Do all of the work up to there outside the transaction, and only do the updates inside the transaction. Note, though, that nothing you can do will make this whole process transactional - because the blobstore updates themselves aren't.
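
As a rough illustration of that split, here is a minimal sketch that uses in-memory stand-ins rather than the real App Engine services. Every name in it (blobs, humans, store_blob, update_human, and the local run_in_transaction) is hypothetical. In the actual handler, the blob steps would be the files.blobstore calls from the question, and the transactional step would be passed to db.run_in_transaction. The point is only the ordering: write the blob first, update the single Human record inside the transaction, and delete the old blob afterwards.

```python
import itertools

# In-memory stand-ins. All names here are hypothetical, not the App Engine API.
blobs = {}    # blob_key -> content   (plays the role of the Blobstore)
humans = {}   # email -> record dict  (plays the role of the Datastore)
_next_id = itertools.count(1)


def store_blob(content):
    """Non-transactional step: write the blob and return its key."""
    blob_key = 'blob-%d' % next(_next_id)
    blobs[blob_key] = content
    return blob_key


def update_human(email, checksum, version, blob_key):
    """Transactional step: touches only one record, looked up by key."""
    old = humans.get(email)
    old_blob_key = old['content'] if old else None
    humans[email] = {'checksum': checksum, 'version': version,
                     'content': blob_key}
    return old_blob_key


def run_in_transaction(func, *args):
    """Stand-in for a transaction runner; it simply calls the function."""
    return func(*args)


def upload(email, checksum, version, content):
    # 1. Blob work happens before (outside) the transaction.
    blob_key = store_blob(content)
    # 2. Only the record update runs inside the transaction.
    old_blob_key = run_in_transaction(
        update_human, email, checksum, version, blob_key)
    # 3. Clean up the old blob once the update has gone through.
    if old_blob_key is not None:
        blobs.pop(old_blob_key, None)
    return blob_key
```

If step 2 fails, the new blob is left orphaned, which is the non-transactional gap mentioned above; a periodic cleanup job is one way to deal with that.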

Upvotes: 1

Dave W. Smith

Reputation: 24966

I think I see what you're trying to accomplish by using a transaction: either create both the blobstore object and the datastore object (the Human), or create neither of them. But your approach is causing you several problems, one of which is that you can't do non-ancestor queries inside of transactions. You're seeing that when you do the get_blob_key, but you'll also get that querying for Humans. (The first error hides the second.) And then there's the problem of creating a whole new Human instead of updating an existing one, which is going to be left holding a key to a deleted blob.

The easiest way forward is to dispense with the transaction. Store the blob, then determine whether or not you know about this Human. If yes, delete the old blob, and update the Human. If no, create a new Human with the newly-stored blob key.
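
A minimal sketch of that update-or-create flow follows, assuming the Human is keyed by email as in the question's key_name=email. The Human class below is a hypothetical in-memory stand-in for model.Human, and delete_blob is a hypothetical callback standing in for whatever removes the old blob (blobstore.delete in the real app). The lookup is by key rather than by a query on the email property.

```python
class Human(object):
    """Hypothetical in-memory stand-in for model.Human, keyed by email."""
    _table = {}

    def __init__(self, email, checksum, version, content):
        self.email = email
        self.checksum = checksum
        self.version = version
        self.content = content  # key of the blob this Human points at

    @classmethod
    def get_by_key_name(cls, key_name):
        # Key lookup, not a query on the email property.
        return cls._table.get(key_name)

    def put(self):
        Human._table[self.email] = self


def save_upload(email, checksum, version, new_blob_key, delete_blob):
    """Update the existing Human, or create one if none is known."""
    human = Human.get_by_key_name(email)
    if human is None:
        # Unknown user: create a Human pointing at the new blob.
        human = Human(email, checksum, version, new_blob_key)
    else:
        # Known user: drop the old blob, then repoint the same entity.
        delete_blob(human.content)
        human.checksum = checksum
        human.version = version
        human.content = new_blob_key
    human.put()
    return human
```

Because the existing entity is updated in place, it never ends up holding a key to a deleted blob.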

Upvotes: 1
