Anthony Compton

Reputation: 5361

Azure Table/Blob Storage, Data Versioning Patterns

I have some amount of data that is user-editable, read frequently and written occasionally. Currently, I'm thinking of either serializing it to JSON and storing it in Blob Storage on Azure or structuring it (in a different fashion than it is currently) and storing it in Table Storage.

My biggest concern is "upgrading" the data. For instance, today the MyDataObject class has some property MyCoolProperty that is complex (meaning it's not a primitive type). Tomorrow, I find out that the requirements have changed and now my object needs to contain a list of these complex objects, so I have to find a way to upgrade my data to meet this new requirement without breaking existing applications that may not be able to update simultaneously.
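For concreteness, the kind of change that would have to be migrated might look like this (the shapes are hypothetical; only the property names come from the question above):

```python
import json

# Hypothetical v1 shape: MyCoolProperty is a single complex object.
v1 = json.dumps({"Version": 1, "MyCoolProperty": {"Name": "a", "Value": 1}})

# Hypothetical v2 shape: the same property is now a list of those objects.
v2 = json.dumps({"Version": 2, "MyCoolProperty": [{"Name": "a", "Value": 1}]})
```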

So what I'm really asking is this: are there any resources, editorials, frameworks, or best practices for keeping your business moving forward while being able to respond easily (or relatively easily) to requirements changes?

Upvotes: 3

Views: 1784

Answers (1)

Jamie Thomas

Reputation: 1523

Anthony, we had a similar issue while developing Cognito Forms, which runs exclusively on Azure Table Storage. We store all of our entities as JSON, using table storage properties as indexes for querying. While our versioning solution may not be a perfect fit for you, it has worked well for us, so I will describe it here:

  • We track version numbers with each entity
  • Each time we need to revise an entity of a given type, we create a Revision, which is simply a Func<string, string> that applies a transformation to the JSON and represents a specific new version of the type
  • Of course we make changes to the concrete types the JSON will be deserialized into before enabling the revision
  • Finally, when reading entities from storage, we compare the storage version with the current code version, applying the revisions to "catch up" the JSON
  • All entities when saved are always at the latest version
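The steps above can be sketched roughly as follows (a minimal illustration, not Cognito Forms' actual code; the revision for the question's MyCoolProperty change is used as the example, with an embedded "Version" field standing in for the tracked version number):

```python
import json

def to_v2(raw: str) -> str:
    """Revision for v1 -> v2: wrap the single MyCoolProperty in a list."""
    data = json.loads(raw)
    data["MyCoolProperty"] = [data["MyCoolProperty"]]
    data["Version"] = 2
    return json.dumps(data)

# Each revision is a string-to-string transformation, keyed by the
# version number it produces (the Func<string, string> described above).
REVISIONS = {2: to_v2}
CURRENT_VERSION = 2

def catch_up(raw: str) -> str:
    """Apply revisions in order until the stored JSON matches the code version."""
    version = json.loads(raw).get("Version", 1)
    while version < CURRENT_VERSION:
        version += 1
        raw = REVISIONS[version](raw)
    return raw
```

On read, `catch_up` is applied before deserializing into the current concrete type; on write, the entity is always saved at `CURRENT_VERSION`.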

This lets us upgrade on demand as entities are queried from table storage, and also query table storage for entities that are "out of date" so we can perform background updates. Since we have millions of entities, it is not practical or feasible to transform them all as part of a deployment to a system that has no downtime. As Azure automatically upgrades our service tier servers to the latest version, they simply start upgrading the JSON on the fly with no disruption of service.
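The background pass amounts to selecting entities whose stored version lags the code version, for example (a sketch; with Azure Table Storage the same idea could run server-side via an OData filter such as `Version lt 2`, assuming the version number is stored as a queryable table property):

```python
CURRENT_VERSION = 2

def out_of_date(entities):
    """Select entities whose stored version lags behind the code version.

    A background job would catch these up (applying the revisions) and
    write them back at the latest version.
    """
    return [e for e in entities if e.get("Version", 1) < CURRENT_VERSION]
```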

In your specific case, it seems like you could apply a similar methodology, but you would have to find a creative solution for multiple applications sharing the same data in parallel. Ultimately, if two separate systems with separate release cycles are editing the exact same entities, no solution will completely protect you from conflicts. However, you could store multiple versions and/or support both upgrade and downgrade revision logic.
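A downgrade revision is just the inverse transformation, so an older application can still read entities written at a newer version. A hedged sketch, using the same hypothetical shapes as above (note the downgrade is lossy if the list holds more than one item, which is the usual trade-off):

```python
import json

def downgrade_to_v1(raw: str) -> str:
    """Revision for v2 -> v1: collapse the list back to a single object."""
    data = json.loads(raw)
    data["MyCoolProperty"] = data["MyCoolProperty"][0]  # keeps only the first item
    data["Version"] = 1
    return json.dumps(data)
```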

Hope this helps!

Upvotes: 1

Related Questions