Sayo Oladeji

Reputation: 741

JDO - Persisting two entities with same key

I'm working on an App Engine project and using JDO on top of the App Engine datastore for persistence. I have an entity that uses an encoded string as its key along with an application-generated key name (also a string). I did this because my app frequently scoops data from the wild (potentially the same data more than once) and attempts to persist it. To avoid persisting several entities that contain essentially the same data, I hash some properties of the data to get a consistent key name (I'm not manipulating keys directly because of entity relationships). The problem is that whenever I calculate my hash (key name) and store the entity, if an entity with that key already exists, the datastore (or JDO, or whoever the culprit is) silently overwrites its properties without raising any exception. This has serious effects on the app because it overwrites the entities' timestamp field, which we use for ordering. How best can I get around this?

Upvotes: 0

Views: 542

Answers (3)

Ajax

Reputation: 2520

You need to do get-before-set (Check and set or CAS).

CAS is a fundamental tenet of concurrency, and it's a necessary evil of parallel computing.

Gets are much cheaper than sets anyway, so it may actually save you money.

Instead of blindly writing to the datastore, first retrieve; if the entity doesn't exist, catch the exception and just put the entity. If it does exist, do a deep compare before you save. If nothing has changed, don't persist it (and save that cost). If it has changed, choose your merge strategy however you please. One (slightly ugly) way to maintain dated revisions is to store the previous entity as a field in the updated entity (may not work for many revisions).
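A minimal sketch of that get-before-put flow, using a plain in-memory map as a stand-in for the datastore so it runs anywhere. The `GetBeforePut` and `Item` classes, their fields, and `saveIfNewOrChanged` are hypothetical names, not part of JDO or the App Engine API; against the real datastore the lookup would be a get by key and the write a put. The merge strategy shown (take the new payload, keep the original timestamp) is just one possible choice.

```java
import java.util.HashMap;
import java.util.Map;

class GetBeforePut {
    /** Hypothetical entity: keyName is the hash-derived key, timestamp is used for ordering. */
    static final class Item {
        final String keyName;
        final String payload;
        final long timestamp;

        Item(String keyName, String payload, long timestamp) {
            this.keyName = keyName;
            this.payload = payload;
            this.timestamp = timestamp;
        }

        /** The "deep compare": here only the payload counts as data. */
        boolean sameData(Item other) {
            return payload.equals(other.payload);
        }
    }

    enum Result { CREATED, UNCHANGED, MERGED }

    // Assumption: an in-memory map standing in for the datastore.
    private final Map<String, Item> store = new HashMap<>();

    Result saveIfNewOrChanged(Item incoming) {
        Item existing = store.get(incoming.keyName); // get first
        if (existing == null) {
            store.put(incoming.keyName, incoming);   // nothing there yet: plain put
            return Result.CREATED;
        }
        if (existing.sameData(incoming)) {
            return Result.UNCHANGED;                 // skip the write; timestamp survives
        }
        // Merge strategy for this sketch: new payload, original timestamp.
        store.put(incoming.keyName,
                new Item(incoming.keyName, incoming.payload, existing.timestamp));
        return Result.MERGED;
    }

    Item get(String keyName) {
        return store.get(keyName);
    }
}
```

This sketch is single-threaded; it does not protect against two writers racing on the same key.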

But in this case, you have to get before set. If you don't expect many duplicates and want to be really chintzy, you can do an exists query first: a keys-only count query on the key you want to use (costs 7x less than a full get). In pseudocode: if count() == 0 then put() else getAndMaybePut().

The count query syntax might look slow, but from my benchmarks, it's the fastest (and cheapest) possible way to tell if an entity exists:

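import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.FetchOptions;
import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.Query;
import com.google.appengine.api.datastore.Query.FilterOperator;
import com.google.appengine.api.datastore.Query.FilterPredicate;

// Returns true if an entity with the given key is already stored.
public boolean exists(Key key) {
    Query q;
    if (key.getParent() == null) {
      q = new Query(key.getKind());
    } else {
      // Ancestor query restricted to the key's parent
      q = new Query(key.getKind(), key.getParent());
    }
    q.setKeysOnly(); // fetch keys only, not full entities
    q.setFilter(new FilterPredicate(
      Entity.KEY_RESERVED_PROPERTY, FilterOperator.EQUAL, key));
    // At most one entity can match, so a limit of 1 is enough
    return 1 == DatastoreServiceFactory.getDatastoreService().prepare(q)
      .countEntities(FetchOptions.Builder.withLimit(1));
}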

Upvotes: 2

John Patterson

Reputation: 497

You must do a get() to see if an entity with the same key exists before you put() the new entity. There is no way around doing this.

You can use memcache and local "in-memory" caching to speed up your get() operation. This may only help if you are likely to read the same information multiple times. If not, the memcache query may actually slow down your process.

To ensure that two requests do not overwrite each other, you should use a transaction. This is not possible with a query as suggested by Ajax unless you put all items in a single entity group, which may limit your updates to 1 per second.

In pseudo code:

  1. Create Key from hashing data
  2. Check in-memory cache for key (use a concurrent set of keys, e.g. one backed by a ConcurrentHashMap), return if found
  3. Check MemcacheService for key, return if found
  4. Start transaction
  5. Get entity from datastore, return if found
  6. Create entity in datastore
  7. Commit transaction, return if fails due to concurrent update
  8. Put Key in cache (in-memory and memcache)

Step 7 will fail if another request (thread) has already written the same key at the same time.
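The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not App Engine code: `DedupingStore`, `keyNameFor`, and `storeIfAbsent` are hypothetical names, a `ConcurrentHashMap` stands in for the datastore, and its atomic `putIfAbsent` stands in for the transactional get, create, and commit of steps 4 to 7. The memcache lookup (step 3) is left out, and SHA-256 is just one reasonable choice of hash.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class DedupingStore {
    // Step 2 stand-in: in-process cache of keys already known to be stored.
    private final Set<String> seenKeys = ConcurrentHashMap.newKeySet();
    // Assumption: stand-in for the datastore; the value is the creation timestamp.
    private final ConcurrentHashMap<String, Long> store = new ConcurrentHashMap<>();

    /** Step 1: derive a stable key name by hashing the identifying properties. */
    static String keyNameFor(String... properties) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            for (String p : properties) {
                digest.update(p.getBytes(StandardCharsets.UTF_8));
                digest.update((byte) 0); // separator so ("ab","c") differs from ("a","bc")
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    /** Returns true if this call created the entity, false if it already existed. */
    boolean storeIfAbsent(String keyName, long timestamp) {
        if (seenKeys.contains(keyName)) {
            return false; // step 2: already seen locally, skip the datastore
        }
        // Steps 4-7 collapsed: atomically create only if no entity has this key.
        boolean created = store.putIfAbsent(keyName, timestamp) == null;
        seenKeys.add(keyName); // step 8: remember the key for later calls
        return created;
    }

    Long timestampOf(String keyName) {
        return store.get(keyName);
    }
}
```

A second call with the same key name returns false and leaves the stored timestamp untouched, which is the behavior the question asks for.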

Upvotes: 1

Ankur Jain

Reputation: 1404

What I suggest is that, instead of saving the ID as a string, you either use a Long ID for your entity or use the Key datatype, which is auto-generated by App Engine.

   @PersistenceCapable
   public class Test{
     @PrimaryKey
     @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
     private Long ID;

     // getter and setter 
   }

This will give you a unique value every time.

Upvotes: 0
