Reputation: 4231
I plan to run a numerical simulation on Windows Azure. The simulation can take days or weeks. Every second or so, the simulation produces a set of numbers such as temperature: double, pressure: double, velocity: double[], etc., which I would like to store.
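The requirements are:
To save all the data produced every second immediately and preferably in one request.
To be able to read any of the stored data (using e.g. javascript) even during the numerical simulation runtime.
To have temperature, pressure, velocity etc. separate. I would like to read e.g. all the pressures in one call without reading velocities etc.
On a global level, the storage should be split into projects and the projects should contain temperature "files", pressure "files" etc. and each "file" should contain a sequence of numbers.
It should be cheap.
I do not need any advanced features -> it should behave more or less as files in the file-system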
Which storage shall I use? Can you point me to a tutorial that discusses such a use-case?
Upvotes: 0
Views: 466
Reputation: 136369
My recommendation would be to use Azure Table Storage for your project. It's "dirt" cheap and is capable of storing massive amounts of data.
Coming to specific requirements:
To save all the data produced every second immediately and preferably in one request.
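You could use Entity Group Transactions to store the data in one request. There are some limitations, so I would recommend that you read up on them: all entities in a batch must belong to the same table and share the same PartitionKey, a batch can contain at most 100 entities, and its total payload cannot exceed 4 MB.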
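To make the batch limits concrete, here is a minimal sketch of how entities could be grouped before they are submitted. It is plain Python with no Azure SDK calls; `group_into_batches` is a hypothetical helper name, the entities are assumed to be dictionaries with a `PartitionKey` field, and the 4 MB payload limit is not checked.

```python
from collections import defaultdict
from typing import Dict, Iterator, List

# Table service limit on entities per Entity Group Transaction.
MAX_ENTITIES_PER_BATCH = 100


def group_into_batches(entities: List[dict]) -> Iterator[List[dict]]:
    """Yield lists of entities that share one PartitionKey and hold at most 100 items.

    Hypothetical helper: it only groups the data. Submitting each batch
    to the Table service is left to whichever client library you use.
    """
    by_partition: Dict[str, List[dict]] = defaultdict(list)
    for entity in entities:
        by_partition[entity["PartitionKey"]].append(entity)

    for partition_entities in by_partition.values():
        # Slice each partition's entities into chunks of at most 100.
        for start in range(0, len(partition_entities), MAX_ENTITIES_PER_BATCH):
            yield partition_entities[start:start + MAX_ENTITIES_PER_BATCH]
```

Each yielded list could then go out as one transaction against a single table. Since a batch cannot span tables, writing temperature, pressure and velocity to separate tables would still take one request per table.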
To be able to read any of the stored data (using e.g. javascript) even during the numerical simulation runtime.
Since Windows Azure Table Storage is a REST-based service, you can fetch the data using JavaScript as well. I would recommend using Shared Access Signatures for querying the data, though, as that is much more secure: the client-side code then carries a time-limited, permission-scoped token instead of the storage account key.
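As a rough illustration of what such a REST query looks like, the sketch below builds the URL that a client would request. The account name, table name, the `Value` property and the SAS token are all hypothetical placeholders; I am assuming the token is generated on the server side with read-only permission and handed to the client. `$filter` and `$select` are the OData query options the Table service accepts.

```python
from typing import List
from urllib.parse import quote


def build_query_url(account: str, table: str, partition_key: str,
                    select: List[str], sas_token: str) -> str:
    """Build a query URL for one partition that returns only the selected properties.

    All arguments are placeholders supplied by the caller. The SAS token is
    appended unchanged, because it is already a URL-encoded query string.
    """
    # OData string literals use single quotes; a quote inside is doubled.
    odata_filter = "PartitionKey eq '{}'".format(partition_key.replace("'", "''"))
    query = "$filter={}&$select={}".format(
        quote(odata_filter, safe=""),
        quote(",".join(select), safe=","),
    )
    return "https://{}.table.core.windows.net/{}()?{}&{}".format(
        account, table, query, sas_token.lstrip("?"))
```

Pointing the query at a "pressure" table and selecting only the value column reads pressures without touching the velocities. Keep in mind that one response returns at most 1,000 entities, so longer series have to be fetched page by page using the continuation information in the response headers.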
To have temperature, pressure, velocity etc. separate. I would like to read e.g. all the pressures in one call without reading velocities etc.
On a global level, the storage should be split into projects and the projects should contain temperature "files", pressure "files" etc. and each "file" should contain a sequence of numbers.
This is where things get interesting. Basically, what you're looking to do is denormalize the data, and Azure Table Storage is meant for that. What you call a "file", I would call a "table", so there will be a "temperature" table, a "pressure" table, and so on. The approach I would recommend is to save the data as a message in a Windows Azure Queue when you first collect it, and then have another process (a worker role, perhaps) pull this message and push the data into the different tables, transforming it as required for each table.
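The transformation step in the worker could look roughly like the sketch below. The message layout (a JSON object with `project`, `step`, `temperature`, `pressure` and `velocity` fields), the `Value` property name and the key scheme are my assumptions for illustration, not something the service prescribes. Table Storage has no array property type, so the velocity array is serialized to a JSON string here.

```python
import json
from typing import Dict


def fan_out(message_body: str) -> Dict[str, dict]:
    """Turn one queued sample into one entity per destination table.

    Returns a mapping from table name to the entity to insert there.
    The message layout and key scheme are assumptions for this sketch.
    """
    sample = json.loads(message_body)
    keys = {
        "PartitionKey": sample["project"],
        # Zero-padded so that RowKeys sort in simulation order.
        "RowKey": "{:012d}".format(sample["step"]),
    }

    entities = {}
    for quantity in ("temperature", "pressure", "velocity"):
        value = sample[quantity]
        if isinstance(value, list):
            # Arrays are not a supported property type: store them as a string.
            value = json.dumps(value)
        entities[quantity] = dict(keys, Value=value)
    return entities
```

The simulation then only has to enqueue one small message per time step, and the worker performs one insert per table from the returned mapping.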
It should be cheap.
Windows Azure Table Storage is cheap. You basically pay for the amount of data you store, the number of transactions you perform against the service, and the data that flows out of the data center. Please visit the Windows Azure Pricing page for more details.
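To get a feel for the numbers, the three billing components can be put into a small calculation. This is only a sketch: the function names are made up, the rates are parameters you would fill in from the pricing page, and billing transactions per 10,000 is an assumption about the unit.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def writes_for_run(days: int, tables: int = 3) -> int:
    """Number of single-entity writes for one sample per second into each table."""
    return days * SECONDS_PER_DAY * tables


def estimate_cost(stored_gb: float, transactions: int, egress_gb: float,
                  price_per_gb: float, price_per_10k_transactions: float,
                  price_per_egress_gb: float) -> float:
    """Sum the three components: storage, transactions and outbound data.

    All prices are supplied by the caller. No real rates are built in.
    """
    return (stored_gb * price_per_gb
            + transactions / 10_000 * price_per_10k_transactions
            + egress_gb * price_per_egress_gb)
```

For example, `writes_for_run(7)` gives the write count for a one-week run with three tables, which can then be passed as `transactions` together with the current rates.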
I do not need any advanced features -> it should behave more or less as files in the file-system
Azure Table Storage is essentially a key-value pair based data store so it's relatively easy to use.
Word of Caution
Azure Table Storage is a bit different from regular SQL tables in that you don't have the luxury of creating additional indexes (called secondary indexes) on a table. You only get a single index (on PartitionKey/RowKey) per table. It is therefore very important to choose the PartitionKey/RowKey values wisely, taking into consideration how you are going to read the data back from the table.
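One possible key design for time-series data like yours, sketched under my own assumptions: use the project name as PartitionKey and a zero-padded step counter as RowKey. Keys are compared as strings, so the padding is what makes a range of time steps a contiguous range in the index. `row_key` and `range_filter` are hypothetical helper names.

```python
def row_key(step: int, width: int = 12) -> str:
    """Zero-pad the step counter so string order matches numeric order."""
    if step < 0 or len(str(step)) > width:
        raise ValueError("step does not fit into the chosen key width")
    return str(step).zfill(width)


def range_filter(project: str, first_step: int, last_step: int) -> str:
    """Build an OData filter that selects a contiguous range of steps in one project.

    Assumes the project name contains no single quotes.
    """
    return "PartitionKey eq '{}' and RowKey ge '{}' and RowKey le '{}'".format(
        project, row_key(first_step), row_key(last_step))
```

Without the padding, "10" would sort before "9" and a range query over steps would return the wrong rows. With it, a filter such as `range_filter("p1", 0, 59)` is answered from the single PartitionKey/RowKey index rather than by scanning the table.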
You may find these links useful:
http://channel9.msdn.com/Events/Build/2012/4-004
Design of Partitioning for Azure Table Storage
Upvotes: 5